Abstract: The past several years have witnessed a surge of 
research investigating various aspects of sparse representations 
and compressed sensing. Most of this work has focused on the 
finite-dimensional setting in which the goal is to decompose 
a finite-length vector into a given finite dictionary. Underlying 
many of these results is the conceptual notion of an uncertainty 
principle: a signal cannot be sparsely represented in two 
different bases. Here, we extend these ideas and results to the 
analog, infinite-dimensional setting by considering signals that 
lie in a finitely-generated shift-invariant (SI) space. This class 
of signals is rich enough to include many interesting special 
cases such as multiband signals and splines. By adapting the 
notion of coherence defined for finite dictionaries to infinite 
SI representations, we develop an uncertainty principle similar 
in spirit to its finite counterpart. We demonstrate tightness of 
our bound by considering a bandlimited lowpass train that 
achieves the uncertainty principle. Building upon these results 
and similar work in the finite setting, we show how to find a 
sparse decomposition in an overcomplete dictionary by solving 
a convex optimization problem. The distinguishing feature of 
our approach is the fact that even though the problem is 
defined over an infinite domain with infinitely many variables 
and constraints, under certain conditions on the dictionary 
spectrum our algorithm can find the sparsest representation 
by solving a finite-dimensional problem. 



I. Introduction 

Uncertainty relations date back to the work of Weyl and 
Heisenberg who showed that a signal cannot be localized 
simultaneously in both time and frequency. This basic prin- 
ciple was then extended by Landau, Pollack, Slepian and 
later Donoho and Stark to the case in which the signals are 
not restricted to be concentrated on a single interval [1], [2], 
[3], [4]. The uncertainty principle has deep philosophical 
interpretations. For example, in the context of quantum 
mechanics it implies that a particle's position and momentum 
cannot be simultaneously measured. In harmonic analysis it 
imposes limits on the time-frequency resolution [5]. 

Recently, there has been a surge of research into dis- 
crete uncertainty relations in more general finite-dimensional 
bases [6], [7], [8]. This work has been spurred in part by the 
relationship between sparse representations and the emerging 
field of compressed sensing [9], [10]. In particular, several 
works have shown that discrete uncertainty relations can 
be used to establish uniqueness of sparse decompositions 
in different bases. Furthermore, there is an 
intimate connection between uncertainty principles and the 
ability to recover sparse expansions using convex program- 
ming [6], [7], [11]. 

The vast interest in representations in redundant dictio- 
naries stems from the fact that the flexibility offered by 
such systems can lead to decompositions that are extremely 
sparse, namely use only a few dictionary elements. However, 
finding a sparse expansion in practice is in general a diffi- 
cult combinatorial optimization problem. Two fundamental 
questions at the heart of overcomplete representations are 
what is the smallest number of dictionary elements needed 
to represent a given signal, and how can one find the sparsest 
expansion in a computationally efficient manner. In recent 
years, several key papers have addressed both of these 
questions in a discrete setting, in which the signals to be 
represented are finite-length vectors [6], [7], [11], [12], [13], 
[14], [10], [8]. 

The discrete generalized uncertainty principle for pairs 
of orthonormal bases states that a vector in $\mathbb{R}^N$ cannot be 
simultaneously sparse in two orthonormal bases. The number 
of non-zero representation coefficients is bounded below by 
the inverse coherence [6], [7]. The coherence is defined as 
the largest absolute inner product between vectors in each 
basis [15], [6]. This principle has been used to establish 
conditions under which a convex $\ell_1$ optimization program 
can recover the sparsest possible decomposition in a dictionary 
consisting of both bases [6], [7], [11]. These results 
were later generalized in [13], [12], [14] to representations 
in arbitrary dictionaries and to other efficient reconstruction 
algorithms [14]. 

The classical uncertainty principle is concerned with ex- 
panding a continuous-time analog signal in the time and 
frequency domains. However, the generalizations outlined 
above are mainly focused on the finite-dimensional setting. 
In this paper, our goal is to extend these recent ideas and 
results to the analog domain by first deriving uncertainty 
relations for more general classes of analog signals and 
arbitrary analog dictionaries, and then suggesting concrete 
algorithms to decompose a continuous-time signal into a 
sparse expansion in an infinite-dimensional dictionary. 

In our development, we focus our attention on continuous- 
time signals that lie in shift-invariant (SI) subspaces of L2 
[16], [17], [18]. Such signals can be expressed in terms of 



linear combinations of shifts of a finite set of generators: 

$$x(t) = \sum_{\ell=1}^{N} \sum_{n \in \mathbb{Z}} a_\ell[n]\, \phi_\ell(t - nT), \tag{1}$$

where $\phi_\ell(t)$, $1 \le \ell \le N$ are the SI generators, and $a_\ell[n]$ are 
the expansion coefficients. Clearly, $x(t)$ is characterized by 
infinitely many coefficients $a_\ell[n]$. Therefore, the finite results 
which provide bounds on the number of non-zero expansion 
coefficients in pairs of bases decompositions are not immediately 
relevant here. Instead, we characterize analog sparsity 
as the number of active generators that comprise a given 
representation, where the $\ell$th generator is said to be active 
if $a_\ell[n]$, $n \in \mathbb{Z}$ is not identically zero. 

Starting with expansions in two orthonormal bases, we 
show that the number of active generators in each represen- 
tation obeys an uncertainty principle similar in spirit to that 
of finite decompositions. The key to establishing this relation 
is in defining an analog coherence between the two bases. 
Our definition replaces the inner product in the finite setting 
by the largest spectral value of the sampled cross-correlation 
between basis elements, in the analog case. The similarity 
between the finite and infinite cases can also be seen by 
examining settings in which the uncertainty bound is tight. In 
the discrete scenario, the lower uncertainty limit is achieved 
by decomposing a spike train into the spike and Fourier 
bases, which are maximally incoherent [4]. To generalize 
this result to the analog domain we first develop an analog 
spike-Fourier pair and prove that it is maximally incoherent. 
The analog spike basis is obtained by modulations of the 
basic lowpass filter (LPF), which is maximally spread in 
frequency. In the time domain, these signals are given by 
shifts of the sinc function, whose samples generate shifted 
spikes. The discrete Fourier basis is replaced by an analog 
Fourier basis, in which the elements are frequency shifts 
of a narrow LPF in the continuous-time frequency domain. 
Tightness of the uncertainty relation is demonstrated by 
expanding a train of narrow LPFs in both bases. 

We next address the problem of sparse decomposition in 
an overcomplete dictionary, corresponding to using more 
than N generators in (1). In the finite setting, it can be 
shown that under certain conditions on the dictionary, a 
sparse decomposition can be found using computationally 
efficient algorithms such as $\ell_1$ optimization [19], [7], [11], 
[9]. However, directly generalizing this result to the analog 
setting is challenging. Although in principle we can define 
an $\ell_1$ optimization program similar in spirit to its finite 
counterpart, it will involve infinitely many variables and 
constraints and therefore it is not clear how to solve it in 
practice. Instead, we develop an alternative approach by 
exploiting recent results on analog compressed sensing [20], 
[21], [22], [23], that leads to a finite-dimensional convex 
problem whose solution can be used to find the analog sparse 
decomposition. Our algorithm is based on a three-stage 
process: In the first step we sample the analog signal ignoring 
the sparsity, and formulate the decomposition problem in 
terms of sparse signal recovery from the given samples. In 
the second stage, we exploit results on infinite measurement 
models (IMV) and multiple measurement vectors (MMV) 
[24], [22], [25], [26] in order to determine the active gen- 
erators, by solving a finite-dimensional convex optimization 
problem. Finally, we use this information to simultaneously 
solve the resulting infinite set of equations by inverting a 
finite matrix [27]. Our method works under certain technical 
conditions, which we elaborate on in the appropriate section. 
We also indicate how these results can be extended to more 
general classes of dictionaries. 

The paper is organized as follows. In Section II we review 
the generalized discrete uncertainty principle and introduce 
the class of analog signals we will focus on. The analog 
uncertainty principle is formulated and proved in Section III. 
In Section IV we consider a detailed example illustrating the 
analog uncertainty relation and its tightness. In particular we 
introduce the analog version of the maximally incoherent 
spike-Fourier pair. Sparse decompositions in two orthonormal 
analog bases are discussed in Section V. These results 
are extended to arbitrary dictionaries in Section VI. 

In the sequel, we denote signals in $L_2$ by lower case 
letters e.g., $x(t)$, and SI subspaces of $L_2$ by $\mathcal{A}$. Vectors 
in $\mathbb{R}^N$ are written as boldface lowercase letters e.g., $\mathbf{x}$, 
and matrices as boldface uppercase letters e.g., $\mathbf{A}$. The $i$th 
element of a vector $\mathbf{x}$ is denoted $x_i$. The identity matrix of 
appropriate dimension is written as $\mathbf{I}$. For a given matrix 
$\mathbf{A}$, $\mathbf{A}^T$ and $\mathbf{A}^H$ are its transpose and conjugate transpose 
respectively, $\mathbf{A}_\ell$ is its $\ell$th column, and $\mathbf{A}^\ell$ is the $\ell$th row. 
The standard Euclidean norm is denoted $\|\mathbf{x}\|_2 = \sqrt{\mathbf{x}^H \mathbf{x}}$, 
$\|\mathbf{x}\|_1 = \sum_i |x_i|$ is the $\ell_1$ norm of $\mathbf{x}$, and $\|\mathbf{x}\|_0$ is the 
cardinality of $\mathbf{x}$, namely the number of non-zero elements. 
The complex conjugate of a complex number $a$ is denoted 
$\bar{a}$. The Fourier transform of a signal $x(t)$ in $L_2$ is defined as 
$X(\omega) = \int_{-\infty}^{\infty} x(t) e^{-j\omega t}\, dt$. We use the convention that upper 
case letters represent Fourier transforms. The discrete-time 
Fourier transform (DTFT) of a sequence $x[n]$ in $\ell_2$ is defined 
by $X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n] e^{-j\omega n}$. To emphasize the fact 
that the DTFT is $2\pi$-periodic we use the notation $X(e^{j\omega})$. 

II. Problem Formulation 

A. Discrete Uncertainty Principles 

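The generalized uncertainty principle is concerned with 
pairs of representations of a vector $\mathbf{x} \in \mathbb{R}^N$ in two different 
orthonormal bases [6], [7]. Suppose we have two orthonormal 
bases for $\mathbb{R}^N$: $\{\boldsymbol{\phi}_\ell, 1 \le \ell \le N\}$ and $\{\boldsymbol{\psi}_\ell, 1 \le \ell \le N\}$. 
Any vector $\mathbf{x}$ in $\mathbb{R}^N$ can then be decomposed uniquely in 
terms of each one of these vector sets: 

$$\mathbf{x} = \sum_{\ell=1}^{N} a_\ell \boldsymbol{\phi}_\ell = \sum_{\ell=1}^{N} b_\ell \boldsymbol{\psi}_\ell. \tag{2}$$

Since the bases are orthonormal, the expansion coefficients 
are given by $a_\ell = \boldsymbol{\phi}_\ell^H \mathbf{x}$ and $b_\ell = \boldsymbol{\psi}_\ell^H \mathbf{x}$. Denoting by $\boldsymbol{\Phi}, \boldsymbol{\Psi}$ 
the matrices with columns $\boldsymbol{\phi}_\ell, \boldsymbol{\psi}_\ell$ respectively, (2) can be 
written as $\mathbf{x} = \boldsymbol{\Phi}\mathbf{a} = \boldsymbol{\Psi}\mathbf{b}$, with $\mathbf{a} = \boldsymbol{\Phi}^H \mathbf{x}$ and $\mathbf{b} = \boldsymbol{\Psi}^H \mathbf{x}$. 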



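The uncertainty relation sets limits on the sparsity of the 
decomposition for any vector $\mathbf{x} \in \mathbb{R}^N$. Specifically, let $A = \|\mathbf{a}\|_0$ 
and $B = \|\mathbf{b}\|_0$ denote the number of non-zero elements 
in each one of the expansions. The generalized uncertainty 
principle [7], [6] states that 

$$\frac{1}{2}(A + B) \ge \sqrt{AB} \ge \frac{1}{\mu(\boldsymbol{\Phi}, \boldsymbol{\Psi})}, \tag{3}$$

where $\mu(\boldsymbol{\Phi}, \boldsymbol{\Psi})$ is the coherence between the bases $\boldsymbol{\Phi}$ and 
$\boldsymbol{\Psi}$ and is defined by 

$$\mu(\boldsymbol{\Phi}, \boldsymbol{\Psi}) = \max_{\ell, r} \left| \boldsymbol{\phi}_\ell^H \boldsymbol{\psi}_r \right|. \tag{4}$$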

The coherence measures the similarity between basis ele- 
ments. This definition was introduced in [15] to heuristically 
characterize the performance of matching pursuit, and later 
used in [6], [7], [12], [14] in order to analyze the basis 
pursuit algorithm. 
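
As an illustration of the coherence just defined, the following is a minimal sketch that evaluates the largest absolute inner product between two orthonormal bases. The helper names (`identity_basis`, `fourier_basis`, `coherence`) are hypothetical and chosen for this example only; the spike and unitary Fourier bases are used as the test pair because the text below identifies them as maximally incoherent.

```python
import cmath
import math


def identity_basis(n):
    """Spike (identity) basis: the k-th vector has a single 1 at position k."""
    return [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]


def fourier_basis(n):
    """Unitary DFT basis: the k-th vector has entries exp(2j*pi*i*k/n)/sqrt(n)."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * cmath.exp(2j * math.pi * i * k / n) for i in range(n)]
            for k in range(n)]


def coherence(basis_a, basis_b):
    """Largest absolute inner product between a vector of each basis."""
    best = 0.0
    for a in basis_a:
        for b in basis_b:
            inner = sum(x.conjugate() * y for x, y in zip(a, b))
            best = max(best, abs(inner))
    return best


if __name__ == "__main__":
    n = 16
    mu = coherence(identity_basis(n), fourier_basis(n))
    print(mu, 1.0 / math.sqrt(n))  # both values should agree up to rounding
```

For the spike-Fourier pair every inner product has magnitude $1/\sqrt{N}$, so the maximum sits at the lower end of the admissible range; comparing a basis with itself gives coherence 1, the upper end.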

It can easily be shown that $1/\sqrt{N} \le \mu(\boldsymbol{\Phi}, \boldsymbol{\Psi}) \le 1$ 
[6]. The upper bound follows from the Cauchy-Schwarz 
inequality and the fact that the basis elements have norm 
1. The lower bound is the result of the fact that the matrix 
$\mathbf{M} = \boldsymbol{\Phi}^H \boldsymbol{\Psi}$ is unitary and consequently $\mathbf{M}^H \mathbf{M} = \mathbf{I}_N$. This 
in turn implies that the sum of the squared magnitudes of the elements of $\mathbf{M}$ 
is equal to $N$. Since there are $N^2$ elements, the largest 
magnitude cannot be smaller than $1/\sqrt{N}$. The lower bound of 
$1/\sqrt{N}$ can be achieved, for example, by choosing the two 
orthonormal bases as the spike (identity) and Fourier bases 
[4]. With this choice, the uncertainty relation (3) becomes 



$$A + B \ge 2\sqrt{AB} \ge 2\sqrt{N}. \tag{5}$$

Assuming $\sqrt{N}$ is an integer, the relations in (5) are all 
satisfied with equality when $\mathbf{x}$ is a spike train with spacing 
$\sqrt{N}$, resulting in $\sqrt{N}$ non-zero elements. This follows from 
the fact that the discrete Fourier transform of $\mathbf{x}$ is also a spike 
train with the same spacing. Therefore, $\mathbf{x}$ can be decomposed 
both in time and in frequency into $\sqrt{N}$ basis vectors. 
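
The tightness claim for the spike train can be checked numerically. The sketch below is an illustrative assumption-laden example (the function names and the tolerance `tol` are not from the paper): it builds a unit spike train of length $N$ with spacing $\sqrt{N}$, computes its DFT directly from the definition, and counts the non-zero entries in time and in frequency.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def count_nonzero(values, tol=1e-9):
    """Number of entries whose magnitude exceeds a small tolerance."""
    return sum(1 for v in values if abs(v) > tol)


def spike_train(n):
    """Unit spikes at multiples of sqrt(n); n must be a perfect square."""
    root = math.isqrt(n)
    if root * root != n:
        raise ValueError("n must be a perfect square")
    return [1.0 if t % root == 0 else 0.0 for t in range(n)]


if __name__ == "__main__":
    n = 16
    x = spike_train(n)
    a_count = count_nonzero(x)        # sparsity in the spike basis
    b_count = count_nonzero(dft(x))   # sparsity in the Fourier basis
    print(a_count, b_count, 2 * math.isqrt(n))
```

With $N = 16$ both counts equal $\sqrt{N} = 4$, so $A + B = 2\sqrt{N}$ and the bound is met with equality.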

As we discuss in Section V, the uncertainty relation 
provides insight into how sparse a signal $\mathbf{x}$ can be represented 
in an overcomplete dictionary consisting of $\boldsymbol{\Phi}$ 
and $\boldsymbol{\Psi}$. It also sheds light on the ability to compute such 
decompositions using computationally efficient algorithms. 
Most of the research to date on sparse expansions has 
focused on the discrete setting in which the goal is to 
represent a finite-length vector $\mathbf{x}$ in $\mathbb{R}^N$ in terms of a given 
dictionary using as few elements as possible. First general 
steps towards extending the notions and ideas underlying 
sparse representations and compressed sensing to the analog 
domain have been developed in [20], [22], [23], [28]. Here 
we would like to take a further step in this direction by 
extending the discrete uncertainty principle to the analog 
setting. 

B. Shift-Invariant Signal Expansions 

In order to develop a general framework for analog uncer- 
tainty principles we first need to describe the set of signals 



we consider. A popular model in signal and image processing 
is that of signals that lie in SI spaces. A finitely generated SI 
subspace in $L_2$ is defined as [16], [17], [18]: 

$$\mathcal{A} = \left\{ x(t) = \sum_{\ell=1}^{N} \sum_{n \in \mathbb{Z}} a_\ell[n]\, \phi_\ell(t - nT) : a_\ell[n] \in \ell_2 \right\}. \tag{6}$$

The functions $\phi_\ell(t)$ are referred to as the generators of $\mathcal{A}$. 
Examples of SI spaces include multiband signals [20], [23] 
and spline functions [29], [27]. Expansions of the type (6) 
are also encountered in communication systems, when the 
analog signal is produced by pulse amplitude modulation. In 
the Fourier domain, we can represent any $x(t) \in \mathcal{A}$ as 

$$X(\omega) = \sum_{\ell=1}^{N} A_\ell(e^{j\omega T})\, \Phi_\ell(\omega), \tag{7}$$

where 

$$A_\ell(e^{j\omega T}) = \sum_{n \in \mathbb{Z}} a_\ell[n]\, e^{-j\omega n T} \tag{8}$$

is the DTFT of $a_\ell[n]$ at frequency $\omega T$, and is $2\pi/T$ periodic. 

In order to guarantee a unique stable representation of 
any signal in $\mathcal{A}$ by a sequence of coefficients $a_\ell[n]$, the 
generators $\phi_\ell(t)$ are typically chosen such that the functions 
$\{\phi_\ell(t - nT), n \in \mathbb{Z}, 1 \le \ell \le N\}$ form a Riesz basis for 
$L_2$. This means that there exist constants $\alpha > 0$ and $\beta < \infty$ 
such that 

$$\alpha \|a\|^2 \le \left\| \sum_{\ell=1}^{N} \sum_{n \in \mathbb{Z}} a_\ell[n]\, \phi_\ell(t - nT) \right\|^2 \le \beta \|a\|^2, \tag{9}$$

where $\|a\|^2 = \sum_{\ell=1}^{N} \sum_{n \in \mathbb{Z}} |a_\ell[n]|^2$, and the norm in the 
middle term is the standard $L_2$ norm. Condition (9) implies 
that any $x(t) \in \mathcal{A}$ has a unique and stable representation in 
terms of the sequences $a_\ell[n]$. By taking Fourier transforms 
in (9) it follows that the shifts of the generators $\phi_\ell(t)$ form 
a Riesz basis if and only if [17] 



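$$\alpha \mathbf{I} \preceq \mathbf{M}_{\phi\phi}(e^{j\omega}) \preceq \beta \mathbf{I}, \quad \text{a.e. } \omega, \tag{10}$$

where 

$$\mathbf{M}_{\phi\phi}(e^{j\omega}) = \begin{bmatrix} R_{\phi_1\phi_1}(e^{j\omega}) & \cdots & R_{\phi_1\phi_N}(e^{j\omega}) \\ \vdots & \ddots & \vdots \\ R_{\phi_N\phi_1}(e^{j\omega}) & \cdots & R_{\phi_N\phi_N}(e^{j\omega}) \end{bmatrix}, \tag{11}$$

and for any two functions $\phi(t), \psi(t)$ with Fourier transforms 
$\Phi(\omega), \Psi(\omega)$, 

$$R_{\phi\psi}(e^{j\omega}) = \frac{1}{T} \sum_{k \in \mathbb{Z}} \overline{\Phi\!\left(\frac{\omega}{T} - \frac{2\pi}{T}k\right)}\, \Psi\!\left(\frac{\omega}{T} - \frac{2\pi}{T}k\right). \tag{12}$$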



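Note that $R_{\phi\psi}(e^{j\omega})$ is the DTFT of the cross correlation 
sequence $r_{\phi\psi}[n] = \langle \phi(t - nT), \psi(t) \rangle$, where the inner 
product on $L_2$ is defined as 

$$\langle s(t), x(t) \rangle = \int_{-\infty}^{\infty} \overline{s(t)}\, x(t)\, dt. \tag{13}$$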



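In Section VI we consider overcomplete signal expansions 
in which more than $N$ generators $\phi_\ell(t)$ are used to represent 
a signal $x(t)$ in $\mathcal{A}$. In this case (9) can be generalized to allow 
for stable overcomplete decompositions in terms of a frame 
for $\mathcal{A}$. The functions $\{\psi_\ell(t - nT), n \in \mathbb{Z}, 1 \le \ell \le M\}$ form 
a frame for the SI space $\mathcal{A}$ if there exist constants $\alpha > 0$ 
and $\beta < \infty$ such that 

$$\alpha \|x(t)\|_2^2 \le \sum_{\ell=1}^{M} \sum_{n \in \mathbb{Z}} \left| \langle \psi_\ell(t - nT), x(t) \rangle \right|^2 \le \beta \|x(t)\|_2^2 \tag{14}$$

for all $x(t) \in \mathcal{A}$, where $\|x(t)\|_2^2 = \langle x(t), x(t) \rangle$. 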

Our main interest is in expansions of a signal $x(t)$ in a 
SI subspace $\mathcal{A}$ of $L_2$ in terms of orthonormal bases for $\mathcal{A}$. 
The generators $\{\phi_\ell(t)\}$ of $\mathcal{A}$ form an orthonormal basis$^1$ if 

$$\langle \phi_\ell(t - nT), \phi_r(t - mT) \rangle = \delta_{nm}\delta_{\ell r}, \tag{15}$$

for all $\ell, r, n, m$, where $\delta_{nm} = 1$ if $n = m$ 
and 0 otherwise. Since $\langle \phi_\ell(t - nT), \phi_r(t - mT) \rangle = 
\langle \phi_\ell(t - (n - m)T), \phi_r(t) \rangle$, (15) is equivalent to 

$$\langle \phi_\ell(t - nT), \phi_r(t) \rangle = \delta_{n0}\delta_{\ell r}. \tag{16}$$

Taking the Fourier transform of (16), the orthonormality 
condition can be expressed in the Fourier domain as 

$$R_{\phi_\ell \phi_r}(e^{j\omega}) = \delta_{\ell r}. \tag{17}$$

Given an orthonormal basis $\{\phi_\ell(t - nT)\}$ for $\mathcal{A}$, the 
unique representation coefficients $a_\ell[n]$ in (6) are given by 
$a_\ell[n] = \langle \phi_\ell(t - nT), x(t) \rangle$. This can be seen by taking the 
inner product of $x(t)$ in (6) with $\phi_r(t - mT)$ and using 
the orthogonality relation (15). Evidently, computing the 
expansion coefficients in an orthonormal decomposition is 
straightforward. There is also a simple relationship between 
the energy of $x(t)$ and the energy of the coefficient sequence 
in this case, as incorporated in the following proposition: 

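Proposition 1: Let $\{\phi_\ell(t), 1 \le \ell \le N\}$ generate an 
orthonormal basis for a SI subspace $\mathcal{A}$, and let $x(t) = 
\sum_{\ell=1}^{N} \sum_{n \in \mathbb{Z}} a_\ell[n]\, \phi_\ell(t - nT)$. Then 

$$\|x(t)\|_2^2 = \frac{1}{2\pi} \int_0^{2\pi} \sum_{\ell=1}^{N} \left| A_\ell(e^{j\omega}) \right|^2 d\omega, \tag{18}$$

where $\|x(t)\|_2^2 = \langle x(t), x(t) \rangle$ and $A_\ell(e^{j\omega})$ is the DTFT of 
$a_\ell[n]$. 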

Proof: See Appendix U ■ 
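
The frequency-domain energy expression in Proposition 1 is a Parseval relation applied to each coefficient sequence. The sketch below checks that identity numerically for finite-length coefficient sequences; it is an illustration only, and the helper names, the convention that each sequence starts at $n = 0$, and the grid size are assumptions made for this example.

```python
import cmath
import math


def dtft(seq, omega):
    """DTFT of a finite sequence seq[0], seq[1], ... evaluated at omega."""
    return sum(a * cmath.exp(-1j * omega * n) for n, a in enumerate(seq))


def energy_time(sequences):
    """Sum over generators and time indices of |a_l[n]|^2."""
    return sum(abs(a) ** 2 for seq in sequences for a in seq)


def energy_freq(sequences, grid=64):
    """(1/2pi) * integral over [0, 2pi) of sum_l |A_l(e^{jw})|^2, via a
    uniform Riemann sum; exact for sequences shorter than `grid` samples."""
    total = 0.0
    for k in range(grid):
        omega = 2.0 * math.pi * k / grid
        total += sum(abs(dtft(seq, omega)) ** 2 for seq in sequences)
    return total / grid


if __name__ == "__main__":
    coeffs = [[1.0, -2.0, 0.5], [0.0, 3.0]]  # two generators, short sequences
    print(energy_time(coeffs), energy_freq(coeffs))
```

The two printed values agree up to rounding, which is the content of (18) once orthonormality has reduced the signal energy to the energy of the coefficient sequences.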

C. Analog Problem Formulation 

In the finite-dimensional setting, sparsity is defined in 
terms of the number of non-zero expansion coefficients in 
a given basis. In an analog decomposition of the form (6), 
there are in general infinitely many coefficients so that it is 
not immediately clear how to define the notion of analog 
sparsity. 

$^1$Here and in the sequel, when we say that a set of signals $\{\phi_\ell(t)\}$ form 
(or generate) a basis, we mean that the basis functions are $\{\phi_\ell(t - nT), n \in \mathbb{Z}, 1 \le \ell \le N\}$. 

In our development, analog sparsity is measured by the 
number of generators needed to represent $x(t)$. In other 
words, some of the sequences $a_\ell[n]$ in (1) may be identically 
zero, in which case 

$$x(t) = \sum_{|\ell| = A} \sum_{n \in \mathbb{Z}} a_\ell[n]\, \phi_\ell(t - nT), \tag{19}$$

where the notation $|\ell| = A$ means a sum over at most $A$ 
elements. Evidently, in our definition, sparsity is determined 
by the energy of the entire sequence $a_\ell[n]$ and not by the 
values of the individual elements. 

In general, the number of zero sequences depends on the 
choice of basis. Suppose we have an alternative representation 

$$x(t) = \sum_{|\ell| = B} \sum_{n \in \mathbb{Z}} b_\ell[n]\, \psi_\ell(t - nT), \tag{20}$$

where $\{\psi_\ell(t)\}$ also generate an orthonormal basis for $\mathcal{A}$. An 
interesting question is whether there are limitations on $A$ and 
$B$. In other words, can we have two representations that are 
simultaneously sparse so that both $A$ and $B$ are small? This 
question is addressed in the next section and leads to an 
analog uncertainty principle, similar to (3). In Section IV 
we prove that the relation we obtain is tight, by constructing 
an example in which the lower limits are satisfied. 
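
The notion of analog sparsity used here counts whole sequences rather than individual coefficients. A minimal sketch of that count follows, assuming each coefficient sequence is available as a finite window of samples (the function name `active_generators` and the optional tolerance are hypothetical and not part of the paper):

```python
def active_generators(coeffs, tol=0.0):
    """Return the 1-based indices l whose sequence a_l[n] is not identically
    zero on the given window. `coeffs[l-1]` holds the samples of a_l[n]."""
    return [ell for ell, seq in enumerate(coeffs, start=1)
            if any(abs(a) > tol for a in seq)]


if __name__ == "__main__":
    coeffs = [
        [0.0, 0.0, 0.0],    # generator 1: inactive
        [0.0, 1e-3, 0.0],   # generator 2: active, one small non-zero sample
        [5.0, 5.0, 5.0],    # generator 3: active
        [0.0, 0.0, 0.0],    # generator 4: inactive
    ]
    active = active_generators(coeffs)
    print(active, len(active))  # the analog sparsity is the list length
```

A generator with a single non-zero sample counts exactly as much as one whose sequence is dense, which is why the measure depends on the whole sequence and not on how many of its entries vanish.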

As in the discrete setting we expect to be able to use fewer 
generators in a SI expansion by allowing for an overcomplete 
dictionary. In particular, if we expand x(t) using both sets 
of orthonormal bases we may be able to reduce the number 
of sequences in the decomposition beyond what can be 
achieved using each basis separately. The problem is how 
to find a sparse representation in the joint dictionary in 
practice. Even in the discrete setting this problem is NP- 
complete. However, results of [7], [13], [12], [14] show 
that under certain conditions a sparse expansion can be 
determined by solving a convex optimization problem. Here 
we have an additional essential complication due to the fact 
that the problem is defined over an infinite domain so that 
it has infinitely many variables and infinitely many constraints. 
In Section V we show that despite the combinatorial 
complexity and infinite dimensions of the problem, under 
certain conditions on the basis functions, we can recover a 
sparse decomposition by solving a finite-dimensional convex 
optimization problem. 

III. Uncertainty Relations in SI Spaces 

We begin by developing an analog of the discrete uncertainty 
principle for signals $x(t)$ in SI subspaces. Specifically, 
we show that the minimal number of sequences required to 
express $x(t)$ in terms of any two orthonormal bases has to 
satisfy the same inequality (3) as in the discrete setting, with 
an appropriate modification of the coherence measure. 

Theorem 1: Suppose we have a signal $x(t) \in \mathcal{A}$ where 
$\mathcal{A}$ is a SI subspace of $L_2$. Let $\{\phi_\ell(t), 1 \le \ell \le N\}$ and 
$\{\psi_\ell(t), 1 \le \ell \le N\}$ denote two orthonormal generators 
of $\mathcal{A}$, so that $x(t)$ can be expressed in both bases with 
coefficient sequences $a_\ell[n], b_\ell[n]$: 

$$x(t) = \sum_{|\ell| = A} \sum_{n \in \mathbb{Z}} a_\ell[n]\, \phi_\ell(t - nT) = \sum_{|\ell| = B} \sum_{n \in \mathbb{Z}} b_\ell[n]\, \psi_\ell(t - nT). \tag{21}$$

Then, 

$$\frac{1}{2}(A + B) \ge \sqrt{AB} \ge \frac{1}{\mu(\Phi, \Psi)}, \tag{22}$$

where 

$$\mu(\Phi, \Psi) = \max_{\ell, r}\, \operatorname*{ess\,sup}_{\omega} \left| R_{\phi_\ell \psi_r}(e^{j\omega}) \right| \tag{23}$$

and $R_{\phi\psi}(e^{j\omega})$ is defined by (12). 

The coherence $\mu(\Phi, \Psi)$ of (23) is a generalization 
of the notion of discrete coherence (4) defined for 
finite-dimensional bases. To see the analogy, note that 
$R_{\phi\psi}(e^{j\omega})$ is the DTFT of the correlation sequence 
$r_{\phi\psi}[n] = \langle \phi(t - nT), \psi(t) \rangle$. On the other hand, the 
finite-dimensional coherence can be written as $\mu(\boldsymbol{\Phi}, \boldsymbol{\Psi}) = 
(1/N) \max_{\ell, r} |\hat{\boldsymbol{\phi}}_\ell^H \hat{\boldsymbol{\psi}}_r|$, where $\hat{\mathbf{x}}$ is the discrete Fourier 
transform of $\mathbf{x}$ and $N$ is the length of $\mathbf{x}$. 

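Proof: Without loss of generality, we assume that 
$\|x(t)\|_2 = 1$. Since $\{\phi_\ell(t)\}$ and $\{\psi_\ell(t)\}$ both generate 
orthonormal bases, we have from Proposition 1 that 

$$1 = \frac{T}{2\pi} \int_0^{2\pi/T} \sum_{|\ell| = A} \left| A_\ell(e^{j\omega T}) \right|^2 d\omega = \frac{T}{2\pi} \int_0^{2\pi/T} \sum_{|\ell| = B} \left| B_\ell(e^{j\omega T}) \right|^2 d\omega. \tag{24}$$

Using the norm constraint and expressing $X(\omega)$ once in 
terms of $\Phi_\ell(\omega)$ and once in terms of $\Psi_\ell(\omega)$: 

$$\begin{aligned}
1 &= \frac{1}{2\pi} \int_{-\infty}^{\infty} |X(\omega)|^2\, d\omega \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \sum_{|\ell| = A} \sum_{|r| = B} \overline{A_\ell(e^{j\omega T})}\, \overline{\Phi_\ell(\omega)}\, B_r(e^{j\omega T})\, \Psi_r(\omega)\, d\omega \\
&= \frac{T}{2\pi} \int_0^{2\pi/T} \sum_{|\ell| = A} \sum_{|r| = B} \overline{A_\ell(e^{j\omega T})}\, B_r(e^{j\omega T})\, R_{\phi_\ell \psi_r}(e^{j\omega T})\, d\omega \\
&\le \frac{T}{2\pi} \int_0^{2\pi/T} \sum_{|\ell| = A} \sum_{|r| = B} \left| A_\ell(e^{j\omega T}) \right| \left| B_r(e^{j\omega T}) \right| \left| R_{\phi_\ell \psi_r}(e^{j\omega T}) \right| d\omega \\
&\le \mu(\Phi, \Psi)\, \frac{T}{2\pi} \int_0^{2\pi/T} \sum_{|\ell| = A} \left| A_\ell(e^{j\omega T}) \right| \sum_{|r| = B} \left| B_r(e^{j\omega T}) \right| d\omega.
\end{aligned} \tag{25}$$

The third equality follows from rewriting the integral over 
the entire real line as the sum of integrals over intervals 
of length $2\pi/T$, as in the Appendix, and the second 
inequality is a result of (23). Applying the Cauchy-Schwarz 
inequality to the integral in (25) we have 

$$\int_0^{2\pi/T} \sum_{|\ell| = A} \left| A_\ell(e^{j\omega T}) \right| \sum_{|r| = B} \left| B_r(e^{j\omega T}) \right| d\omega \le \left( \int_0^{2\pi/T} \Big( \sum_{|\ell| = A} \left| A_\ell(e^{j\omega T}) \right| \Big)^2 d\omega \right)^{1/2} \left( \int_0^{2\pi/T} \Big( \sum_{|r| = B} \left| B_r(e^{j\omega T}) \right| \Big)^2 d\omega \right)^{1/2}. \tag{26}$$

Using the same inequality we can upper bound the sum in 
(26): 

$$\Big( \sum_{|\ell| = A} \left| A_\ell(e^{j\omega T}) \right| \Big)^2 \le A \sum_{|\ell| = A} \left| A_\ell(e^{j\omega T}) \right|^2. \tag{27}$$

Combining with (26), (25) and (24) leads to 

$$1 \le \mu(\Phi, \Psi) \sqrt{AB}. \tag{28}$$

Using the well-known relation between the arithmetic and 
geometric means completes the proof. ■ 

An interesting question is how small $\mu(\Phi, \Psi)$ can be made 
by appropriately choosing the bases. From Theorem 1 the 
smaller $\mu(\Phi, \Psi)$, the stronger the restriction on the sparsity 
in both decompositions. As we will see in Section V, such a 
limitation is helpful in recovering the true sparse coefficients. 
In the finite setting we have seen that $1/\sqrt{N} \le \mu(\boldsymbol{\Phi}, \boldsymbol{\Psi}) \le 1$ 
[6]. The next theorem shows that the same bounds hold in 
the analog case. 

Theorem 2: Let $\{\phi_\ell(t), 1 \le \ell \le N\}$ and $\{\psi_\ell(t), 1 \le 
\ell \le N\}$ denote two orthonormal generators of a SI subspace 
$\mathcal{A} \subset L_2$ and let $\mu(\Phi, \Psi) = \max_{\ell, r} \operatorname{ess\,sup} |R_{\phi_\ell \psi_r}(e^{j\omega})|$, 
where $R_{\phi\psi}(e^{j\omega})$ is defined by (12). Then 

$$\frac{1}{\sqrt{N}} \le \mu(\Phi, \Psi) \le 1. \tag{29}$$

Proof: We begin by proving the upper bound, which 
follows immediately from the Cauchy-Schwarz inequality 
and the orthonormality of the bases: 

$$\left| R_{\phi_\ell \psi_r}(e^{j\omega}) \right| \le \left( R_{\phi_\ell \phi_\ell}(e^{j\omega})\, R_{\psi_r \psi_r}(e^{j\omega}) \right)^{1/2} = 1, \tag{30}$$

where the last equality is a result of (17). Therefore, 
$\mu(\Phi, \Psi) \le 1$. 

To prove the lower bound, note that since $\phi_\ell(t)$ is in $\mathcal{A}$ 
for each $\ell$, we can express it as 

$$\phi_\ell(t) = \sum_{r=1}^{N} \sum_{n \in \mathbb{Z}} a_r^\ell[n]\, \psi_r(t - nT) \tag{31}$$

for some coefficients $a_r^\ell[n]$, or in the Fourier domain, 

$$\Phi_\ell(\omega) = \sum_{r=1}^{N} A_r^\ell(e^{j\omega T})\, \Psi_r(\omega). \tag{32}$$

Since $\|\phi_\ell(t)\| = 1$ and $\{\psi_r(t)\}$ are orthonormal, we have 
from Proposition 1 that 

$$\frac{T}{2\pi} \int_0^{2\pi/T} \sum_{r=1}^{N} \left| A_r^\ell(e^{j\omega T}) \right|^2 d\omega = 1, \quad 1 \le \ell \le N. \tag{33}$$

Now, using (32) and the orthonormality condition (17) it 
follows that 

$$R_{\phi_\ell \psi_r}(e^{j\omega}) = \overline{A_r^\ell(e^{j\omega})}. \tag{34}$$

Therefore, 

$$\sum_{\ell=1}^{N} \sum_{r=1}^{N} \int_0^{2\pi} \left| R_{\phi_\ell \psi_r}(e^{j\omega}) \right|^2 d\omega = \sum_{\ell=1}^{N} \int_0^{2\pi} \sum_{r=1}^{N} \left| A_r^\ell(e^{j\omega}) \right|^2 d\omega = 2\pi N, \tag{35}$$

where the last equality follows from (33) by performing a 
change of variables $\omega' = \omega T$ in the integral. If $\mu(\Phi, \Psi) < 
1/\sqrt{N}$, then $|R_{\phi_\ell \psi_r}(e^{j\omega})| < 1/\sqrt{N}$ a.e. on $\omega$ and 

$$\sum_{\ell=1}^{N} \sum_{r=1}^{N} \int_0^{2\pi} \left| R_{\phi_\ell \psi_r}(e^{j\omega}) \right|^2 d\omega < 2\pi N, \tag{36}$$

which contradicts (35). ■ 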
It is easy to see that the lower bound in (29) is achieved 
if $R_{\phi_\ell \psi_r}(e^{j\omega}) = 1/\sqrt{N}$ for all $\ell, r$ and $\omega$. In this case the 
uncertainty relation (22) becomes 

$$A + B \ge 2\sqrt{AB} \ge 2\sqrt{N}. \tag{37}$$

As discussed in Section II, in the discrete setting with $\sqrt{N}$ 
an integer, the inequalities in (37) are achieved using the 
spike-Fourier basis and $\mathbf{x}$ equal to a spike train. In the next 
section we show that equality in (37) can be satisfied in the 
analog case as well using a pair of bases that is analogous to 
the spike-Fourier pair, and a bandlimited signal $x(t)$ equal 
to a lowpass train. 

IV. Achieving the Uncertainty Principle 
A. Minimal Coherence 

Consider the space $\mathcal{A}$ of real signals bandlimited to 
$(-\pi N/T, \pi N/T]$. As we show below, any signal in $\mathcal{A}$ can 
be expressed in terms of $N$ SI generators. We would like 
to choose two orthonormal bases, analogous to the spike-Fourier 
pair in the finite setting, for which the coherence 
achieves its lower limit of $1/\sqrt{N}$. To this end, we first 
highlight the essential properties of the finite spike-Fourier 
bases in $\mathbb{C}^N$, and then choose an analog pair with similar 
characteristics. 

The basic properties of the spike-Fourier pair are illustrated 
in Fig. 1. The first element of the spike basis, $\boldsymbol{\phi}_1$, 
is equal to a constant in the discrete Fourier domain, as 
illustrated in the left-hand side of Fig. 1. The remaining 
basis vectors are generated by shifts in time, or modulations 
in frequency, as depicted in the bottom part of the figure. 
In contrast, the first vector of the Fourier basis is sparse in 
frequency: it is represented by a single frequency component 
as illustrated in the right-hand side of the figure. The rest of 
the basis elements are obtained by shifts in frequency. 

We now construct two orthonormal bases for $\mathcal{A}$ with 
minimal coherence by mimicking these properties in the 
continuous-time Fourier domain. Since we are considering 
the class of signals bandlimited to $\pi N/T$, we only treat 
this frequency range. As we have seen, the basic element 
of the spike basis occupies the entire frequency spectrum. 
Therefore, we choose our first analog generator $\phi_1(t)$ to 
be constant over the frequency range $(-\pi N/T, \pi N/T]$. The 
remaining generators are obtained by shifts in time of $\phi_1(t)$ 
or modulations in frequency: 

$$\Phi_\ell(\omega) = \begin{cases} \sqrt{\dfrac{T}{N}}\, e^{-j\omega(\ell - 1)T/N}, & \omega \in (-\pi N/T, \pi N/T]; \\ 0, & \text{otherwise}, \end{cases} \tag{38}$$

corresponding to 

$$\phi_\ell(t) = \sqrt{\frac{N}{T}}\, \operatorname{sinc}\big( (t - (\ell - 1)T')/T' \big), \tag{39}$$

with $T' = T/N$. The normalization constant is chosen to 
ensure that the basis vectors have unit norm. With slight 
abuse of terminology, we refer to the set $\{\phi_\ell(t), 1 \le \ell \le N\}$ 
as the analog spike basis (the basis is actually constructed 
by shifts of this set with period $T$). Note that the samples of 
$\phi_\ell(t)$ at times $nT'$ create a shifted spike sequence, further 
justifying the analogy. The Fourier transform of the analog 
spike basis is illustrated in the left-hand side of Fig. 2. 

To construct the second orthonormal basis, we choose 
$\psi_1(t)$ to be sparse in frequency, as in the finite case. The 
remaining generators are obtained by shifts in frequency. 
To ensure that the generators are real we must have that 
$\Psi_\ell(\omega) = \overline{\Psi_\ell(-\omega)}$. Therefore, we consider only the interval 
$[0, \pi N/T]$. Since we have $N$ real generators, we divide this 
interval into equal sections of length $\pi/T$, and choose each 
$\Psi_\ell(\omega)$ to be constant over the corresponding interval, as 
illustrated in Fig. 2. More specifically, let 

$$\mathcal{I}_\ell = \{ \omega : |\omega| \in [\pi(\ell - 1)/T, \pi\ell/T] \}, \tag{40}$$

be the $\ell$th interval. Then 

$$\Psi_\ell(\omega) = \begin{cases} \sqrt{T}, & \omega \in \mathcal{I}_\ell; \\ 0, & \text{otherwise}. \end{cases} \tag{41}$$

The analog pair of bases generated by $\{\Phi_\ell(\omega), \Psi_\ell(\omega), 1 \le 
\ell \le N\}$ is referred to as the analog spike-Fourier pair. In 
order to complete the analogy with the discrete spike-Fourier 
bases we need to show that both analog sets are orthonormal 
and generate $\mathcal{A}$, and that their coherence is equal to $1/\sqrt{N}$. 
The latter follows immediately by noting that 

$$\overline{\Phi_\ell(\omega)}\, \Psi_r(\omega) = \begin{cases} \dfrac{T}{\sqrt{N}}\, e^{j\omega(\ell - 1)T/N}, & \omega \in \mathcal{I}_r; \\ 0, & \text{otherwise}. \end{cases} \tag{42}$$

It is easy to see that replicas of $\mathcal{I}_r$ at distance $2\pi/T$ will not 
overlap. Furthermore, these replicas tile the entire frequency 
axis; therefore, $|R_{\phi_\ell \psi_r}(e^{j\omega})| = 1/\sqrt{N}$, and $\mu(\Phi, \Psi) = 1/\sqrt{N}$. 
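
The coherence of the analog spike-Fourier pair can be checked numerically by evaluating the sampled cross-spectrum (12) directly. The sketch below is illustrative and rests on stated assumptions: the spike generator spectrum is taken as $\sqrt{T/N}\, e^{-j\omega(\ell-1)T/N}$ on $(-\pi N/T, \pi N/T]$, the Fourier generator spectrum as $\sqrt{T}$ for $|\omega| \in ((r-1)\pi/T, r\pi/T]$, half-open intervals are used so that no boundary point is counted twice, and all function names are hypothetical.

```python
import cmath
import math


def spike_ft(ell, w, n_gen, period):
    """Assumed spectrum of the ell-th analog spike generator, as in (38)."""
    edge = math.pi * n_gen / period
    if -edge < w <= edge:
        return (math.sqrt(period / n_gen)
                * cmath.exp(-1j * w * (ell - 1) * period / n_gen))
    return 0.0


def fourier_ft(r, w, period):
    """Assumed spectrum of the r-th analog Fourier generator, as in (41)."""
    lo = (r - 1) * math.pi / period
    hi = r * math.pi / period
    return math.sqrt(period) if lo < abs(w) <= hi else 0.0


def cross_spectrum(ell, r, omega, n_gen, period):
    """Sampled cross-spectrum (12): (1/T) * sum over k of
    conj(Phi_ell(w_k)) * Psi_r(w_k) with w_k = (omega - 2*pi*k)/T.
    Only finitely many replicas fall inside the band, so k is truncated."""
    total = 0.0
    k_max = n_gen + 1
    for k in range(-k_max, k_max + 1):
        w = (omega - 2.0 * math.pi * k) / period
        total += (complex(spike_ft(ell, w, n_gen, period)).conjugate()
                  * fourier_ft(r, w, period))
    return total / period


if __name__ == "__main__":
    n_gen, period = 4, 2.0
    values = [abs(cross_spectrum(ell, r, omega, n_gen, period))
              for omega in (0.3, 1.7, 4.0, 6.0)
              for ell in range(1, n_gen + 1)
              for r in range(1, n_gen + 1)]
    print(min(values), max(values), 1.0 / math.sqrt(n_gen))
```

For every pair of generators and every tested frequency exactly one replica of the Fourier generator's support is hit, so the magnitude is constant and equal to $1/\sqrt{N}$, matching the lower bound of Theorem 2.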



Fig. 1. Discrete Fourier-domain representation of the spike-Fourier bases in $\mathbb{C}^N$. The left-hand side is the discrete Fourier transform of the spike basis. 
The right-hand side represents the discrete Fourier transform of the Fourier basis. The top row corresponds to the first basis function, while the bottom row 
represents the $\ell$th basis function. 

Fig. 2. Continuous Fourier-domain representation of the analog spike-Fourier bases in $\mathcal{A}$. The left-hand side is the Fourier transform of the spike basis. 
The right-hand side represents the Fourier transform of the Fourier basis. The top row corresponds to the first generator, while the bottom row represents 
the $\ell$th generator. 



To show that $\{\psi_\ell(t), 1 \le \ell \le N\}$ generate $\mathcal{A}$, note that 
any $x(t) \in \mathcal{A}$ can be expressed in the form (6) (or (7)) by 
choosing $A_\ell(e^{j\omega T}) = X(\omega)$ for $\omega \in \mathcal{I}_\ell$. If $X(\omega)$ is zero 
on one of the intervals $\mathcal{I}_\ell$, then $A_\ell(e^{j\omega T})$ will also be zero, 
leading to the multiband structure studied in [20], [23]. Since 
the intervals on which $\Psi_\ell(\omega)$ are non-zero do not overlap, 
the basis is also orthogonal. Finally, orthonormality follows 
from our choice of scaling. 



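Proving that $\{\phi_\ell(t), 1 \le \ell \le N\}$ generate an orthonormal 
basis is a bit more tricky. To see that these functions span 
$\mathcal{A}$ note that from Shannon's sampling theorem, any function 
$x(t)$ bandlimited to $\pi/T'$ with $T' = T/N$ can be written as 

$$x(t) = \sum_{n \in \mathbb{Z}} x(nT')\, \operatorname{sinc}\big( (t - nT')/T' \big). \tag{43}$$

Substituting $n = mN + \ell - 1$, we can replace the sum over 
$n$ by the double sum over $m \in \mathbb{Z}$ and $1 \le \ell \le N$, resulting 
in 

$$x(t) = \sum_{\ell=1}^{N} \sum_{m \in \mathbb{Z}} a_\ell[m]\, \operatorname{sinc}\big( (t - (\ell - 1)T' - mT)/T' \big) \tag{44}$$

with $a_\ell[n] = x((\ell - 1)T' + nT)$, proving that $\{\phi_\ell(t)\}$ 
generate $\mathcal{A}$. Orthonormality of the basis follows from 

$$R_{\phi_\ell \phi_r}(e^{j\omega}) = \frac{1}{N}\, e^{j\omega(\ell - r)/N} \sum_{k=0}^{N-1} e^{-j2\pi k(\ell - r)/N} = \delta_{\ell r}, \tag{45}$$

where we used the relation 

$$\frac{1}{N} \sum_{k=0}^{N-1} e^{-j2\pi k m/N} = \begin{cases} 1, & m = 0 \bmod N; \\ 0, & \text{otherwise}. \end{cases} \tag{46}$$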
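
The orthonormality argument for the analog spike generators rests on the fact that the $N$th roots of unity average to zero unless the exponent index is a multiple of $N$. The following is a minimal numerical check of that relation and of the resulting sampled correlation between two spike generators. It is a sketch under the assumption that the spike spectra are $\sqrt{T/N}\, e^{-j\omega(\ell-1)T/N}$ on the band, so that exactly $N$ replicas contribute; the function names are hypothetical.

```python
import cmath
import math


def roots_of_unity_mean(m, n):
    """(1/n) * sum_{k=0}^{n-1} exp(-2j*pi*k*m/n); equals 1 if n divides m,
    and 0 otherwise (up to rounding)."""
    return sum(cmath.exp(-2j * math.pi * k * m / n) for k in range(n)) / n


def spike_correlation(ell, r, omega, n):
    """Sampled correlation between spike generators ell and r: a common
    phase factor times the roots-of-unity mean with index ell - r."""
    phase = cmath.exp(1j * omega * (ell - r) / n)
    return phase * roots_of_unity_mean(ell - r, n)


if __name__ == "__main__":
    n = 4
    for ell in range(1, n + 1):
        row = [round(abs(spike_correlation(ell, r, 0.7, n)), 12)
               for r in range(1, n + 1)]
        print(row)  # rows of the identity matrix, up to rounding
```

Since $|\ell - r| < N$ for generator indices between 1 and $N$, the mean vanishes for every $\ell \ne r$, which is what makes the shifted sinc generators mutually orthogonal.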



B. Tightness of the Uncertainty Relation 

Given any signal $x(t)$ in $\mathcal{A}$, the uncertainty relation for the analog spike-Fourier pair states that the number of non-zero sequences in the spike and Fourier bases must satisfy (37). We now show that when $\sqrt{N}$ is an integer, these inequalities can be achieved with equality with an appropriate choice of $x(t)$, so that the uncertainty principle is tight. To determine such a signal $x(t)$, we again mimic the construction in the discrete case.

As discussed earlier for the finite Fourier-spike pair, we have equalities in (37) when the length-$N$ vector $x$ is a spike train with $\sqrt{N}$ non-zero values, equally spaced, as illustrated in the left-hand side of Fig. 3. This follows from the fact that the spike train has the same form in both time and frequency. To construct a signal in $\mathcal{A}$ that achieves the analog uncertainty relation with equality, we replace each Fourier-domain spike in the discrete setting by a shifted LPF of width $2\pi/T$ in the analog Fourier domain. To ensure that there are $\sqrt{N}$ non-zero intervals of length $2\pi/T$ in $[-\pi N/T, \pi N/T]$, the frequency spacing between the LPFs is set to $2\pi\sqrt{N}/T$, as depicted in the right-hand side of Fig. 3. This signal can be represented in frequency by $\sqrt{N}$ basis functions $\psi_m(t)$ with $m = 2\sqrt{N}\ell$, $1 \le \ell \le \lfloor \sqrt{N}/2 \rfloor$, and $m = 2\sqrt{N}(\ell - 1) + 1$, $1 \le \ell \le \lceil \sqrt{N}/2 \rceil$. It therefore remains to be shown that $x(t)$ can also be expanded in time using $\sqrt{N}$ signals $\phi_m(t)$.
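The discrete construction that this argument mimics can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $N = 16$, uses the unitary DFT as the Fourier basis, and assumes that the discrete bounds referred to in (37) have the standard product and sum form $AB \ge 1/\mu^2$ and $A + B \ge 2/\mu$. All variable names are ours.

```python
import numpy as np

N = 16                       # illustrative size; sqrt(N) must be an integer
r = int(round(np.sqrt(N)))   # r = sqrt(N)

# Spike train: sqrt(N) equally spaced unit spikes (spacing sqrt(N)).
x = np.zeros(N)
x[::r] = 1.0

# Coefficients of x in the unitary Fourier basis.
X = np.fft.fft(x) / np.sqrt(N)

tol = 1e-9
A = int(np.sum(np.abs(x) > tol))   # non-zeros in the spike basis
B = int(np.sum(np.abs(X) > tol))   # non-zeros in the Fourier basis
mu = 1.0 / np.sqrt(N)              # coherence of the spike-Fourier pair

print(A, B, A * B, 1 / mu**2, A + B, 2 / mu)
```

Both counts equal $\sqrt{N}$, so the assumed product and sum bounds are met with equality, which is the discrete behavior that the LPF train reproduces in the analog setting.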

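Since $x(t)$ is bandlimited to $\pi N/T$,

$$x(t) = \sum_{\ell=1}^{N}\sum_{n\in\mathbb{Z}} a_\ell[n]\,\phi_\ell(t - nT), \tag{47}$$

where $a_\ell[n] = \langle \phi_\ell(t - nT), x(t)\rangle$. In the Fourier domain we have

$$A_\ell(e^{j\omega}) = \frac{1}{\sqrt{NT}}\sum_{k\in\mathbb{Z}} e^{j(\omega - 2\pi k)(\ell-1)/N}\, X\!\left(\frac{\omega}{T} - \frac{2\pi k}{T}\right). \tag{48}$$

Due to the fact that $a_\ell[n]$ is a real sequence, $A_\ell(e^{-j\omega}) = \overline{A_\ell(e^{j\omega})}$. Therefore we consider $A_\ell(e^{j\omega})$ on the interval $[0, \pi]$. For values of $\omega$ in this interval, $X(\omega/T - 2\pi k/T)$ is non-zero only for indices $k = m\sqrt{N}$ with $\lfloor -\sqrt{N}/2 + 1 \rfloor \le m \le \lfloor \sqrt{N}/2 \rfloor$. Since the shifted LPFs are identical, $X(\omega/T - 2\pi m\sqrt{N}/T) = X(\omega/T)$ for these values of $m$. Thus,

$$A_\ell(e^{j\omega}) = \frac{1}{\sqrt{NT}}\, e^{j\omega(\ell-1)/N}\, X\!\left(\frac{\omega}{T}\right) \sum_{m=\lfloor-\sqrt{N}/2+1\rfloor}^{\lfloor\sqrt{N}/2\rfloor} e^{-j2\pi m(\ell-1)/\sqrt{N}} = \frac{1}{\sqrt{T}}\, e^{j\omega(\ell-1)/N}\, X\!\left(\frac{\omega}{T}\right)\delta_{\ell-1,\,r\sqrt{N}}, \tag{49}$$

where $r$ is an arbitrary integer. The last equality follows from (46) and the fact that the sum is over $\sqrt{N}$ consecutive values. Since $1 \le \ell \le N$, $A_\ell(e^{j\omega})$ is nonzero for $\sqrt{N}$ indices $\ell$, so that $x(t)$ can be expanded in terms of $\sqrt{N}$ generators $\phi_\ell(t)$.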

V. Recovery of Sparse Representations 

A. Discrete Representations 

One of the important implications of the discrete uncertainty principle is its relation to sparse approximations [6], [7], [13], [14]. Given two orthonormal bases $\Phi, \Psi$ for $\mathbb{C}^N$, an interesting question is whether one can reduce the number of non-zero expansion coefficients required to represent a vector $x \in \mathbb{C}^N$ by decomposing it in terms of the concatenated dictionary

$$D = [\,\Phi \;\; \Psi\,]. \tag{50}$$

In many cases such a representation can be much sparser than the decomposition in either of the bases alone. The difficulty is in actually finding a sparse expansion $x = D\gamma$ in which $\gamma$ has as few non-zero components as possible. Since $D$ has more columns than rows, the set of equations $x = D\gamma$ is underdetermined and therefore $x$ can have multiple representations $\gamma$. Finding the sparsest choice can be translated into the combinatorial optimization problem

$$\min_{\gamma} \|\gamma\|_0 \quad \text{s.t.} \quad x = D\gamma. \tag{51}$$



Problem (51) is NP-hard in general and cannot be solved efficiently. A surprising result of [6], [7], [11], summarized below in Proposition 2, is that if the coherence $\mu(\Phi,\Psi)$ between the two bases is small enough with respect to the sparsity of $\gamma$, then the sparsest possible $\gamma$ is unique and can be found by the basis pursuit algorithm. This algorithm is a result of replacing the non-convex $\ell_0$ norm by the convex $\ell_1$ norm:

$$\min_{\gamma} \|\gamma\|_1 \quad \text{s.t.} \quad x = D\gamma. \tag{52}$$

Proposition 2: Let $D = [\Phi \;\; \Psi]$ be a dictionary consisting of two orthonormal bases with coherence $\mu(\Phi,\Psi) = \max_{\ell,r} |\phi_\ell^H \psi_r|$. If a vector $x$ has a sparse decomposition in $D$ such that $x = D\gamma$ and $\|\gamma\|_0 < 1/\mu(\Phi,\Psi)$, then this representation is unique, namely there cannot be another $\gamma'$ with $\|\gamma'\|_0 < 1/\mu(\Phi,\Psi)$ and $x = D\gamma'$. Furthermore, if

$$\|\gamma\|_0 < \frac{\sqrt{2} - 0.5}{\mu(\Phi,\Psi)}, \tag{53}$$

then the unique sparse representation can be found by solving the $\ell_1$ optimization problem (52).

Fig. 3. Discrete and analog signals satisfying the uncertainty principle with equality. The left-hand side is the discrete Fourier transform of the spike train. 
The right-hand side represents the analog Fourier transform of the LPF train. 



As detailed in [6], [7], the proof of Proposition 2 follows from the generalized discrete uncertainty principle.
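To make the two thresholds of Proposition 2 concrete, the following sketch evaluates them for the spike-Fourier pair. It is an illustration only: the size $N = 16$, the helper name `coherence`, and the variable names are our assumptions, not part of the original development.

```python
import numpy as np

def coherence(Phi, Psi):
    """mu(Phi, Psi) = max over l, r of |phi_l^H psi_r| for two orthonormal bases."""
    return float(np.max(np.abs(Phi.conj().T @ Psi)))

N = 16
Phi = np.eye(N)                              # spike basis
Psi = np.fft.fft(np.eye(N)) / np.sqrt(N)     # unitary Fourier basis

mu = coherence(Phi, Psi)                     # equals 1/sqrt(N) for this pair
uniqueness_bound = 1.0 / mu                  # uniqueness: ||gamma||_0 < 1/mu
bp_bound = (np.sqrt(2) - 0.5) / mu           # l1 recovery: ||gamma||_0 < (sqrt(2) - 0.5)/mu

print(mu, uniqueness_bound, bp_bound)
```

For this pair the $\ell_1$ threshold is slightly below the uniqueness threshold, so there is a narrow range of sparsity levels for which the representation is unique but recovery by (52) is not guaranteed by the proposition.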

Another useful result on dictionaries with low coherence is that every set of $k \le 2/\mu(\Phi,\Psi) - 1$ columns is linearly independent [13, Theorem 6]. This result can be stated in terms of the Kruskal rank of $D$ [30], which is the maximal number $q$ such that every set of $q$ columns of $D$ is linearly independent.

Proposition 3: [13, Theorem 6] Let $D = [\Phi \;\; \Psi]$ be a dictionary consisting of two orthonormal bases with coherence $\mu(\Phi,\Psi)$. Then $\sigma(D) \ge 2/\mu(\Phi,\Psi) - 1$ where $\sigma(D)$ is the Kruskal rank of $D$.
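For very small dictionaries the Kruskal rank can be computed by exhaustive search, which gives a quick sanity check of Proposition 3. The sketch below is illustrative only; the function `kruskal_rank`, the tolerance, and the choice $N = 4$ are our assumptions, and the search is exponential in the number of columns.

```python
import itertools
import numpy as np

def kruskal_rank(D, tol=1e-9):
    """Largest q such that every set of q columns of D is linearly independent."""
    n_rows, n_cols = D.shape
    q = 0
    for size in range(1, min(n_rows, n_cols) + 1):
        for cols in itertools.combinations(range(n_cols), size):
            # A rank deficit means this column subset is linearly dependent.
            if np.linalg.matrix_rank(D[:, list(cols)], tol=tol) < size:
                return q
        q = size
    return q

N = 4
Phi = np.eye(N)
Psi = np.fft.fft(np.eye(N)) / np.sqrt(N)
D = np.hstack([Phi, Psi])

mu = float(np.max(np.abs(Phi.conj().T @ Psi)))   # 1/sqrt(N) = 0.5
sigma = kruskal_rank(D)
print(sigma, 2 / mu - 1)
```

In this example the bound of Proposition 3 holds with equality: the two spikes and two Fourier vectors that build the length-4 spike train form a dependent set of four columns.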



B. Analog Representations 

We would now like to generalize these recovery results to the analog setup. However, it is not immediately clear how to extend the finite $\ell_1$ basis pursuit algorithm of (52) to the analog domain.

To set up the analog sparse decomposition problem, suppose we have a signal $x(t)$ that lies in a space $\mathcal{A}$, and let $\{\phi_\ell(t), 1 \le \ell \le N\}$, $\{\psi_\ell(t), 1 \le \ell \le N\}$ be two orthonormal generators of $\mathcal{A}$. Our goal is to represent $x(t)$ in terms of the joint dictionary $\{d_\ell(t - nT), 1 \le \ell \le 2N\}$ with

$$d_\ell(t) = \begin{cases}\phi_\ell(t), & 1 \le \ell \le N \\ \psi_{\ell-N}(t), & N+1 \le \ell \le 2N,\end{cases} \tag{54}$$

using as few non-zero sequences as possible. Denoting by $\gamma[n]$ the vector at point $n$ whose elements are $\gamma_\ell[n]$, our problem is to choose the vector sequence $\gamma[n]$ such that

$$x(t) = \sum_{\ell=1}^{2N}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\, d_\ell(t - nT) \tag{55}$$

and $\gamma_\ell[n]$ is identically zero for the largest possible number of indices $\ell$.

We can count the number of non-zero sequences by first computing the $\ell_2$-norm of each sequence. Clearly, $\gamma_\ell[n]$ is equal to 0 for all $n$ if and only if its $\ell_2$ norm $\|\gamma_\ell[n]\|_2 = (\sum_n |\gamma_\ell[n]|^2)^{1/2}$ is zero. Therefore, the number of non-zero sequences $\gamma_\ell[n]$ is equal to $\|c\|_0$ where $c_\ell = \|\gamma_\ell[n]\|_2$. For ease of notation, we denote $\|\gamma\|_{2,0} = \|c\|_0$, and similarly $\|\gamma\|_{2,1} = \|c\|_1$. Finding the sparsest decomposition (55) can then be written as

$$\min_{\gamma} \|\gamma\|_{2,0} \quad \text{s.t.} \quad x(t) = \sum_{\ell=1}^{2N}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\, d_\ell(t - nT). \tag{56}$$

Problem (56) is the analog version of (51). However, in addition to being combinatorial as its finite counterpart, (56) also has infinitely many variables and constraints.
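The mixed norms $\|\gamma\|_{2,0}$ and $\|\gamma\|_{2,1}$ are easy to evaluate once the sequences are truncated to finitely many samples. The following sketch assumes such a truncation, stored as one row per sequence; the function name `mixed_norms` and the tolerance are ours.

```python
import numpy as np

def mixed_norms(gamma, tol=1e-12):
    """gamma has shape (num_sequences, num_samples): a finite truncation of
    the sequences gamma_l[n]. Returns (||gamma||_{2,0}, ||gamma||_{2,1})."""
    c = np.linalg.norm(gamma, axis=1)        # c_l = l2 norm of sequence l
    return int(np.sum(c > tol)), float(np.sum(c))

gamma = np.array([[3.0, 4.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
n_active, l21 = mixed_norms(gamma)
print(n_active, l21)
```

Here two of the three sequences are not identically zero, and the $\ell_{2,1}$ value is the sum of the per-sequence $\ell_2$ norms.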

In order to extend the finite-dimensional decomposition 
results to the analog domain, there are two main questions 
we need to address: 

1) Is there a unique sparse representation for any input 
signal in a given dictionary? 

2) How can we compute a sparse expansion in practice, namely solve (56), despite the combinatorial complexity and infinite dimensions?

The first question is easy to answer. Indeed, the uniqueness condition of Proposition 2 can be readily extended to the analog case. This is due to the fact that its proof is based on the discrete uncertainty relation, which is identical to (22) with the appropriate modification to the coherence measure.

Proposition 4: Suppose that a signal $x(t) \in \mathcal{A}$ has a sparse representation in the joint dictionary $\{d_\ell(t - nT), n \in \mathbb{Z}, 1 \le \ell \le 2N\}$ of (54), which consists of two orthonormal bases $\{\phi_\ell(t - nT), \psi_\ell(t - nT), n \in \mathbb{Z}, 1 \le \ell \le N\}$. If the coefficient sequences $\gamma_\ell[n]$ of (55) satisfy $\|\gamma\|_{2,0} < 1/\mu(\Phi,\Psi)$, where $\mu(\Phi,\Psi)$ is the coherence defined by (23), then this representation is unique.

The second, more difficult question is how to find a unique sparse representation when it exists. We may attempt to develop a solution by replacing the $\ell_0$ norm in (56) by an $\ell_1$ norm, as in the finite-dimensional case. This leads to the convex program

$$\min_{\gamma} \|\gamma\|_{2,1} \quad \text{s.t.} \quad x(t) = \sum_{\ell=1}^{2N}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\, d_\ell(t - nT). \tag{57}$$

However, in practice, it is not clear how to solve (57) since it is defined over an infinite set of variables $\gamma_\ell[n]$, and has infinitely many constraints (for all $t$).

Our approach to treating the analog decomposition problem is to first sample the signal $x(t)$ at a high enough rate, so that $x(t)$ can be determined from the given samples. We will then show that the decomposition problem can be recast in the Fourier domain as that of recovering a set of sparse vectors that share a joint sparsity pattern, from the given sequences of samples. The importance of this reformulation is that under appropriate conditions, it allows us to determine the joint support set (or the active generators) by solving a finite-dimensional optimization problem. Once the active generators are determined, the corresponding coefficient sequences can be readily found.

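We begin by noting that since $\{\phi_\ell(t)\}$ generate an orthonormal basis for $\mathcal{A}$, $x(t)$ is uniquely determined by the $N$ sequences of samples

$$c_\ell[n] = \langle\phi_\ell(t - nT), x(t)\rangle = r_\ell(nT), \quad 1 \le \ell \le N, \tag{58}$$

where $r_\ell(t)$ is the convolution $r_\ell(t) = \phi_\ell(-t) * x(t)$. Indeed, orthonormality of $\{\phi_\ell(t)\}$ immediately implies that

$$x(t) = \sum_{\ell=1}^{N}\sum_{n\in\mathbb{Z}} c_\ell[n]\,\phi_\ell(t - nT). \tag{59}$$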



Therefore, constraining $x(t)$ is equivalent to imposing restrictions on the expansion coefficients $c_\ell[n]$. Taking the inner products on both sides of (55) with respect to $\phi_r(t - mT)$ leads to

$$c_r[m] = \sum_{\ell=1}^{2N}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\,\langle \phi_r(t - mT), d_\ell(t - nT)\rangle = \sum_{\ell=1}^{2N}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\, a_{r\ell}[m-n], \tag{60}$$

where $a_{r\ell}[n] = \langle\phi_r(t - nT), d_\ell(t)\rangle$. In the Fourier domain, (60) can be written as

$$C_r(e^{j\omega}) = \sum_{\ell=1}^{2N} \Gamma_\ell(e^{j\omega})\,A_{r\ell}(e^{j\omega}), \quad 1 \le r \le N. \tag{61}$$

Thus, instead of finding $\gamma_\ell[n]$ satisfying the constraints in (56), we can alternatively seek the smallest number of functions $\Gamma_\ell(e^{j\omega})$ that satisfy (61).

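To simplify (61) we use the definition (54) of $d_\ell(t)$. Since $\langle\phi_r(t - nT), \phi_\ell(t)\rangle = \delta_{r\ell}\delta_{n0}$ and the Fourier transform of $\langle\phi_r(t - nT), \psi_\ell(t)\rangle$ is equal to $R_{\phi_r\psi_\ell}(e^{j\omega})$, (61) can be written as

$$C_r(e^{j\omega}) = \Gamma_r(e^{j\omega}) + \sum_{\ell=N+1}^{2N} \Gamma_\ell(e^{j\omega})\,R_{\phi_r\psi_{\ell-N}}(e^{j\omega}). \tag{62}$$

Denoting by $c(e^{j\omega}), \gamma(e^{j\omega})$ the vectors with elements $C_\ell(e^{j\omega}), \Gamma_\ell(e^{j\omega})$ respectively, we can express (62) as

$$c(e^{j\omega}) = [\,I \;\; M_{\phi\psi}(e^{j\omega})\,]\,\gamma(e^{j\omega}), \tag{63}$$

where $M_{\phi\psi}(e^{j\omega})$ is the sampled cross correlation matrix

$$M_{\phi\psi}(e^{j\omega}) = \begin{bmatrix} R_{\phi_1\psi_1}(e^{j\omega}) & \cdots & R_{\phi_1\psi_N}(e^{j\omega}) \\ \vdots & \ddots & \vdots \\ R_{\phi_N\psi_1}(e^{j\omega}) & \cdots & R_{\phi_N\psi_N}(e^{j\omega}) \end{bmatrix}, \tag{64}$$

with $R_{\phi_\ell\psi_r}(e^{j\omega})$ defined by (12). Our sparse recovery problem is therefore equivalent to

$$\min_{\gamma} \|\gamma(e^{j\omega})\|_{2,0} \quad \text{s.t.} \quad c(e^{j\omega}) = [\,I \;\; M_{\phi\psi}(e^{j\omega})\,]\,\gamma(e^{j\omega}). \tag{65}$$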

Problem (65) resembles the multiple measurement vector (MMV) problem, in which the goal is to jointly decompose $m$ vectors $x_i$, $1 \le i \le m$, in a dictionary $D$ [25], [26], [24], [31]. In the next section we review the MMV model and a recently developed generalization to the case in which it is desirable to jointly decompose infinitely many vectors $x_i$ in terms of a given dictionary $D$. This extension is referred to as the infinite measurement model (IMV) [21]. In Section V-D we show how these ideas can be used to solve (65).

As we will show, the ability to sparsely decompose a set of signals in the IMV and MMV settings depends on the properties of the corresponding dictionary. In our formulation (65), the dictionary is given by

$$D(e^{j\omega}) = [\,I \;\; M_{\phi\psi}(e^{j\omega})\,]. \tag{66}$$

The next proposition establishes some properties of $D(e^{j\omega})$ that will be used in Section V-D in order to solve (65).

Proposition 5: Let $\{\phi_\ell(t - nT), \psi_\ell(t - nT), n \in \mathbb{Z}, 1 \le \ell \le N\}$ denote two orthonormal bases for a SI space $\mathcal{A}$. Let $M_{\phi\psi}(e^{j\omega})$ denote the cross-correlation matrix defined by (64), and let $\mu(\Phi,\Psi)$, $\mu(I, M_{\phi\psi}(e^{j\omega}))$ be the analog and discrete coherence measures defined by (23), (4). Then, for each $\omega$:

1) $M_{\phi\psi}(e^{j\omega})$ is a unitary matrix;

2) $\mu(I, M_{\phi\psi}(e^{j\omega})) \le \mu(\Phi,\Psi)$.

Proof: See the Appendix. ■

C. MMV and IMV Models 

The basic results of [7], [12], [13] on expansions in dictionaries consisting of two orthonormal bases can be generalized to the MMV problem in which we would like to jointly decompose $m$ vectors $x_i$, $1 \le i \le m$, in a dictionary $D$. Denoting by $X$ the matrix with columns $x_i$, our goal is to seek a matrix $\Gamma$ with columns $\gamma_i$ such that $X = D\Gamma$ and $\Gamma$ has as few non-zero rows as possible. In this model, not only is each representation vector $\gamma_i$ sparse, but in addition the vectors share a joint sparsity pattern. The results in [25], [26], [24] establish that under the same conditions as Proposition 2, the unique $\Gamma$ can be found by solving an extension of the $\ell_1$ program:

$$\min_{\Gamma} \|s(\Gamma)\|_1 \quad \text{s.t.} \quad X = D\Gamma. \tag{67}$$

Here $s(\Gamma)$ is a vector whose $\ell$th element is equal to $\|\Gamma^\ell\|$, where $\Gamma^\ell$ is the $\ell$th row of $\Gamma$, and the norm is an arbitrary vector norm. When $\Gamma$ is equal to a single vector $\gamma$, $\|\Gamma^\ell\| = |\gamma_\ell|$ for any choice of norm and (67) reduces to the standard $\ell_1$ optimization problem (52).

Proposition 6: Let $X$ be an $N \times m$ matrix with columns $x_i$, $1 \le i \le m$, that have a joint sparse representation in the dictionary $D = [\Phi \;\; \Psi]$ consisting of two orthonormal bases, so that $X = D\Gamma$ with $\|s(\Gamma)\|_0 = k$. If $k < 1/\mu(\Phi,\Psi)$, where $\mu(\Phi,\Psi) = \max_{\ell,r}|\phi_\ell^H\psi_r|$, then this representation is unique. Furthermore, if

$$k < \frac{\sqrt{2} - 0.5}{\mu(\Phi,\Psi)}, \tag{68}$$

then the unique sparse representation can be found by solving (67) with any vector norm.
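The joint-sparsity structure of the MMV model can be illustrated with a small numerical experiment. Note that the sketch below does not solve the convex program (67); as a stand-in it uses a greedy method, simultaneous orthogonal matching pursuit, which is a different technique. The function `somp`, the sizes, the support, and the random coefficients are all our assumptions.

```python
import numpy as np

def somp(D, X, k):
    """Simultaneous orthogonal matching pursuit (a greedy stand-in, not (67)):
    pick k columns of D that jointly explain all columns of X.
    Returns (sorted support, coefficient matrix Gamma)."""
    support = []
    R = X.copy()
    for _ in range(k):
        # Score each atom by the l2 norm of its correlations with the residual.
        scores = np.linalg.norm(D.conj().T @ R, axis=1)
        if support:
            scores[support] = 0.0
        support.append(int(np.argmax(scores)))
        # Least-squares fit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], X, rcond=None)
        R = X - D[:, support] @ coef
    Gamma = np.zeros((D.shape[1], X.shape[1]), dtype=complex)
    Gamma[support, :] = coef
    return sorted(support), Gamma

N, m, k = 16, 5, 2
D = np.hstack([np.eye(N), np.fft.fft(np.eye(N)) / np.sqrt(N)])   # [Phi Psi]
rng = np.random.default_rng(0)
true_support = [3, 20]                       # one spike atom and one Fourier atom
Gamma_true = np.zeros((2 * N, m), dtype=complex)
Gamma_true[true_support, :] = rng.standard_normal((k, m))
X = D @ Gamma_true

support, Gamma = somp(D, X, k)
print(support)
```

With $\mu = 1/4$ and $k = 2$ the sparsity is below both thresholds of Proposition 6, and the greedy stand-in also identifies the shared row support in this small example.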

The MMV model has been recently generalized to the IMV case in which there are infinitely many vectors $x$ of length $N$, and infinitely many coefficient vectors $\gamma$:

$$x(\lambda) = D\gamma(\lambda), \quad \lambda \in \Lambda, \tag{69}$$

where $\Lambda$ is some set whose cardinality can be infinite. In particular, $\Lambda$ may be uncountable, such as the set of frequencies $\omega \in (-\pi, \pi]$. The $k$-sparse IMV model assumes that the vectors $\{\gamma(\lambda)\}$, which we denote for brevity by $\gamma(\Lambda)$, share a joint sparsity pattern, so that the non-zero elements are all supported on a fixed location set of size $k$ [21]. This model was first introduced in [20] in the context of blind sampling of multiband signals, and later analyzed in more detail in [21].

A major difficulty with the IMV model is that it is not clear in practice how to determine the entire solution set $\gamma(\Lambda)$ since there are infinitely many equations to solve. Thus, an $\ell_1$ optimization or a greedy approach is not immediately applicable here. In [21] it was shown that (69) can be converted to a finite MMV without losing any information by a set of operations that are grouped under a block referred to as the continuous-to-finite (CTF) block. The essential idea is to first recover the support of $\gamma(\Lambda)$, namely the non-zero location set, by solving a finite MMV. We then reconstruct $\gamma(\Lambda)$ from the data $x(\Lambda)$ and the knowledge of the support, which we denote by $S$. The reason for this separation is that once $S$ is known, the linear relation of (69) becomes invertible when the coherence is low enough.

To see this, let $D_S$ denote the matrix containing the subset of the columns of $D$ whose indices belong to $S$. The system of (69) can then be written as

$$x(\lambda) = D_S\gamma^S(\lambda), \quad \lambda \in \Lambda, \tag{70}$$

where $\gamma^S(\lambda)$ is the vector that consists of the entries of $\gamma(\lambda)$ in the locations $S$. Since $\gamma(\Lambda)$ is $k$-sparse, $|S| \le k$. In addition, from Proposition 3 it follows that if $\mu(\Phi,\Psi) < 1/k$ then every $k$ columns of $D$ are linearly independent. Therefore $D_S$ consists of linearly independent columns, implying that $D_S^\dagger D_S = I$, where $D_S^\dagger = (D_S^H D_S)^{-1}D_S^H$ is the Moore-Penrose pseudo-inverse of $D_S$. Multiplying (70) by $D_S^\dagger$ on the left gives

$$\gamma^S(\lambda) = D_S^\dagger x(\lambda), \quad \lambda \in \Lambda. \tag{71}$$

The elements in $\gamma(\lambda)$ not supported on $S$ are all zero. Therefore (71) allows for exact recovery of $\gamma(\Lambda)$ once the finite set $S$ is correctly identified.
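Once the support is known, the recovery step (71) is a single matrix product applied to every $x(\lambda)$. The sketch below illustrates this on a finite grid of $\lambda$ values standing in for $\Lambda$; the dictionary size, the support `S`, and the synthetic coefficients are our assumptions.

```python
import numpy as np

N = 8
D = np.hstack([np.eye(N), np.fft.fft(np.eye(N)) / np.sqrt(N)])   # [Phi Psi]
S = [1, 9]                                   # assumed-known support, |S| = k = 2

rng = np.random.default_rng(1)
lambdas = np.linspace(-np.pi, np.pi, 50)     # finite grid standing in for Lambda
gamma_S = rng.standard_normal((len(S), lambdas.size)) * np.exp(1j * lambdas)
X = D[:, S] @ gamma_S                        # column i is x(lambda_i) = D_S gamma^S(lambda_i)

D_S = D[:, S]
D_S_pinv = np.linalg.inv(D_S.conj().T @ D_S) @ D_S.conj().T      # (D_S^H D_S)^{-1} D_S^H
gamma_rec = D_S_pinv @ X                     # (71), applied to all lambda at once

print(np.max(np.abs(gamma_rec - gamma_S)))
```

Here $\mu = 1/\sqrt{8} < 1/k$, so $D_S$ has full column rank and the explicit formula agrees with a generic pseudo-inverse routine such as `np.linalg.pinv`.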



In order to determine $S$ by solving a finite-dimensional problem we exploit the fact that $\mathrm{span}(x(\Lambda))$ is finite-dimensional: since $x(\lambda)$ is of length $N$, $\mathrm{span}(x(\Lambda))$ has dimension at most $N$. In addition, it is shown in [21] that if there exists a solution set $\gamma(\Lambda)$ with sparsity $k$, and the matrix $D$ has Kruskal rank $\sigma(D) \ge 2k$, then every finite collection of vectors spanning the subspace $\mathrm{span}(x(\Lambda))$ contains sufficient information to recover $S$ exactly. Therefore, to find $S$ all we need is to construct a matrix $V$ whose range space is equal to $\mathrm{span}(x(\Lambda))$. We are then guaranteed that the linear system

$$V = DU \tag{72}$$

has a unique $k$-sparse solution $U$ whose row support is equal to $S$. This result allows us to avoid the infinite structure of (69) and to concentrate on finding the finite set $S$ by solving the single MMV system of (72). The solution can be determined using an $\ell_1$ relaxation of the form (67) with $V$ replacing $X$, as long as the conditions of Proposition 6 hold, namely the coherence is small enough with respect to the sparsity.

In practice, a matrix $V$ with column span equal to $\mathrm{span}(x(\Lambda))$ can be constructed by first forming the matrix $Q = \int_{\lambda\in\Lambda} x(\lambda)x^H(\lambda)\,d\lambda$, assuming that the integral exists. Every $V$ satisfying $Q = VV^H$ will then have a column span equal to $\mathrm{span}(x(\Lambda))$ [21]. In particular, the columns of $V$ can be chosen as the eigenvectors of $Q$ multiplied by the square-root of the corresponding eigenvalues.
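The construction of $V$ from $Q$ can be sketched numerically by replacing the integral with a Riemann sum over a grid of $\lambda$ values. This is an illustration under our own assumptions (grid size, dictionary, support, and synthetic coefficient functions), not a prescription from [21].

```python
import numpy as np

N, k = 8, 2
D = np.hstack([np.eye(N), np.fft.fft(np.eye(N)) / np.sqrt(N)])   # [Phi Psi]
S = [2, 11]                                  # hypothetical joint support
rng = np.random.default_rng(2)

# Finite grid standing in for the continuous index set Lambda.
lam = np.linspace(-np.pi, np.pi, 200, endpoint=False)
gamma_S = np.vstack([np.cos(lam), rng.standard_normal(lam.size)])
X = D[:, S] @ gamma_S                        # columns are x(lambda)

# Q = integral of x(lambda) x^H(lambda) d lambda, approximated by a Riemann sum.
dlam = lam[1] - lam[0]
Q = (X @ X.conj().T) * dlam

# Eigen-decomposition of the Hermitian matrix Q; keep the non-zero eigenvalues.
w, E = np.linalg.eigh(Q)
keep = w > 1e-9 * w.max()
V = E[:, keep] * np.sqrt(w[keep])            # scaled eigenvectors, so Q = V V^H

print(V.shape)
```

The matrix `V` has only $k$ columns although `X` has 200, yet both span the same subspace, which is what allows the infinite system to be replaced by the finite system (72).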

We summarize the steps enabling a finite-dimensional 
solution to the IMV problem in the following theorem. 

Theorem 3: Consider the system of equations (69) where $D = [\Phi \;\; \Psi]$ is a dictionary consisting of two orthonormal bases with coherence $\mu(\Phi,\Psi) = \max_{\ell,r}|\phi_\ell^H\psi_r|$. Suppose (69) has a $k$-sparse solution set $\gamma(\Lambda)$ with support set $S$. If the Kruskal rank $\sigma(D) \ge 2k$, then $\gamma(\Lambda)$ is unique. In addition, let $V$ be a matrix whose column space is equal to $\mathrm{span}(x(\Lambda))$. Then, the linear system $V = DU$ has a unique $k$-sparse solution $U$ whose row support is equal to $S$. Denoting by $D_S$ the columns of $D$ whose indices belong to $S$, the non-zero elements $\gamma^S(\lambda)$ are given by $\gamma^S(\lambda) = D_S^\dagger x(\lambda)$. Finally, if

$$k < \frac{\sqrt{2} - 0.5}{\mu(\Phi,\Psi)}, \tag{73}$$

then $\sigma(D) \ge 2k$ and the unique sparse $U$ can be found by solving (67) with any vector norm.

D. Analog Dictionaries 

In Section V-B we showed that the analog decomposition problem (56) is equivalent to (65). The latter is very similar to the IMV problem (69). Indeed, we seek a continuous set of vectors $\gamma$ with joint sparsity that have the smallest number of non-zero rows, and satisfy an infinite set of linear equations. However, in contrast to (69), the matrix in (65) depends on $\omega$. Therefore, Theorem 3 cannot be applied since it is not clear what matrix figures in the finite MMV representation. Nonetheless, the essential idea of separating the support recovery from that of the actual values of $\gamma(e^{j\omega})$ is still valid. In particular, we can solve (65) by first determining the support set of $\gamma(e^{j\omega})$. Once the support is known, we have that

$$\gamma^S(e^{j\omega}) = (D_S^H(e^{j\omega})D_S(e^{j\omega}))^{-1}D_S^H(e^{j\omega})\,c(e^{j\omega}), \tag{74}$$

where $D(e^{j\omega})$ is defined by (66). The inverse in (74) exists if $\mu(I, M_{\phi\psi}(e^{j\omega}))$ is smaller than $1/k$. From Proposition 5 it is sufficient to require that $\mu(\Phi,\Psi) < 1/k$.

To find the support set $S$ we distinguish between two different cases:

1) The constant case in which $M_{\phi\psi}(e^{j\omega})$ of (64) can be written as

$$M_{\phi\psi}(e^{j\omega}) = AZ(e^{j\omega}). \tag{75}$$

Here $A$ is a fixed matrix independent of $\omega$, and $Z(e^{j\omega})$ is an invertible diagonal matrix with diagonal elements $Z_\ell(e^{j\omega})$; the columns of $A$ are normalized such that $\operatorname{ess\,sup}|Z_\ell(e^{j\omega})| = 1$ for all $\ell$.

2) The rich case in which the support of every subset of $\gamma(e^{j\omega})$ of a given size $M$ is equal to the support $S$ of the entire set.

The first case involves a condition on the dictionary. The sec- 
ond allows for arbitrary dictionaries, but imposes a constraint 
on the expansion sequences. This restriction is quite mild, 
and satisfied for a large class of dictionaries and signals. In 
both cases we show that the support can be found by solving 
a finite-dimensional optimization problem. 

Constant case: We begin by treating the setting in which the sampled cross correlation matrix can be written as in (75). For example, consider the case in which $\mathcal{A}$ is the space of real signals bandlimited to $\pi N/T$, as in Section IV. Then $\phi_\ell(t), \psi_\ell(t)$ defined there satisfy (75) (for $\omega > 0$) with $A = (1/\sqrt{N})F$, where $F$ denotes the $N \times N$ Fourier matrix and $Z_\ell(e^{j\omega}) = \exp\{j\omega(\ell - 1)/N\}$.

The unitarity of $M_{\phi\psi}(e^{j\omega})$, which follows from Proposition 5, implies that $A = M_{\phi\psi}(e^{j\omega})Z^{-1}(e^{j\omega})$ must be unitary as well. Indeed, for all $\omega$, we have

$$A^H A = Z^{-H}(e^{j\omega})\,M_{\phi\psi}^H(e^{j\omega})M_{\phi\psi}(e^{j\omega})\,Z^{-1}(e^{j\omega}) = (Z(e^{j\omega})Z^H(e^{j\omega}))^{-1}. \tag{76}$$

Therefore, $|Z_\ell(e^{j\omega})|$ is independent of $\omega$. Since $\max_\omega |Z_\ell(e^{j\omega})| = 1$, we conclude that $|Z_\ell(e^{j\omega})| = 1$ for all $\omega$ so that $Z(e^{j\omega})Z^H(e^{j\omega}) = I$, which together with (76) proves the unitarity of $A$.

To obtain a correlation structure of the form (75) we may start with a given orthonormal basis $\{\psi_\ell(t - nT)\}$, and then create another orthonormal basis $\{\phi_\ell(t - nT)\}$ by choosing

$$\phi_\ell(t) = \sum_{r=1}^{N}\sum_{n\in\mathbb{Z}} a_{\ell r}[n]\,\psi_r(t - nT). \tag{77}$$

Here $a_{\ell r}[n]$ is any set of sequences for which $A_{\ell r}(e^{j\omega}) = [A]_{\ell r}Z_r(e^{j\omega})$ with $A$ an arbitrary unitary matrix, and $Z(e^{j\omega})$ an arbitrary diagonal unitary matrix. This is a direct consequence of the proof of Proposition 5.



Under the condition (75) we now show that we can convert (65) to a finite MMV problem. Indeed, let the first $N$ elements of $\gamma(e^{j\omega})$ be denoted by $a(e^{j\omega})$ and the remaining $N$ elements by $b(e^{j\omega})$. Then (65) becomes

$$\min_{a,d}\; \|a(e^{j\omega})\|_{2,0} + \|d(e^{j\omega})\|_{2,0} \quad \text{s.t.} \quad c(e^{j\omega}) = [\,I \;\; A\,]\begin{bmatrix} a(e^{j\omega}) \\ d(e^{j\omega})\end{bmatrix}, \tag{78}$$

where $d(e^{j\omega}) = Z(e^{j\omega})b(e^{j\omega})$, and we used the fact that since $Z(e^{j\omega})$ is diagonal and invertible, $\|b(e^{j\omega})\|_{2,0} = \|d(e^{j\omega})\|_{2,0}$ so that the two vector sequences have the same sparsity. Problem (78) has the required IMV form. It can be solved by first finding the sparsest matrix $U$ that satisfies $C = [I \;\; A]U$ where the columns of $C$ form a basis for the span of $\{c(e^{j\omega}), -\pi < \omega \le \pi\}$. As we have seen, a basis can be determined in frequency by first forming the correlation matrix

$$Q = \int_{-\pi}^{\pi} c(e^{j\omega})c^H(e^{j\omega})\,d\omega. \tag{79}$$

Alternatively, we can find a basis in time by creating

$$Q' = \sum_{n=-\infty}^{\infty} c[n]c^H[n]. \tag{80}$$
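That the frequency-domain and time-domain constructions lead to the same basis can be checked numerically: by Parseval's relation, $Q$ and $Q'$ differ only by a factor of $2\pi$ under the convention $c(e^{j\omega}) = \sum_n c[n]e^{-j\omega n}$. The sketch below assumes finite-length sample sequences and evaluates (79) by a Riemann sum on a uniform frequency grid; the sizes and random data are ours.

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 4, 32                                 # N sample sequences of assumed finite length L
c_time = rng.standard_normal((N, L))         # column n is the vector c[n]

# Time-domain correlation matrix, as in (80).
Q_time = c_time @ c_time.T

# Frequency-domain version, as in (79): DTFT on K >= L uniform frequencies,
# followed by a Riemann sum over one period of length 2*pi.
K = 64
c_freq = np.fft.fft(c_time, n=K, axis=1)     # column k is c(e^{j w_k})
Q_freq = (c_freq @ c_freq.conj().T) * (2 * np.pi / K)

print(np.max(np.abs(Q_freq - 2 * np.pi * Q_time)))
```

Since the two matrices are proportional, they share eigenvectors, so either can be used to form a basis for the span of the samples; the time-domain form avoids computing any transform.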

The basis can then be chosen as the eigenvectors corresponding to nonzero eigenvalues of $Q$ or $Q'$, which we denote by $C$. To find $U$ we consider the convex program

$$\min_{U} \|s(U)\|_1 \quad \text{s.t.} \quad C = [\,I \;\; A\,]\,U. \tag{81}$$



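Let $S$ denote the rows in $U$ that are not identically zero and let $\gamma^S[n]$ be the corresponding sequences $\gamma_\ell[n]$, $\ell \in S$. Then

$$\gamma^S(e^{j\omega}) = \begin{bmatrix} I & 0 \\ 0 & Z_{S'}^{-1}(e^{j\omega})\end{bmatrix}(D_S^H D_S)^{-1}D_S^H\,c(e^{j\omega}), \tag{82}$$

where $D = [I \;\; A]$, and $Z_{S'}(e^{j\omega})$ contains the diagonal elements of $Z(e^{j\omega})$ associated with the set $S'$ of rows in $S$ between $N+1$ and $2N$. The remaining sequences $\gamma_\ell$, $\ell \notin S$, are identically zero. Proposition 6 provides conditions under which (81) will find the sparsest representation in terms of the coherence $\mu(I, A)$ (where we rely on the fact that $A$ is unitary). Since $M_{\phi\psi}(e^{j\omega}) = AZ(e^{j\omega})$ and $|Z_\ell(e^{j\omega})| = 1$, we have that $\mu(I, A) = \mu(\Phi,\Psi)$.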

We summarize our results on analog sparse decomposi- 
tions in the following theorem. 

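Theorem 4: Let $\{\phi_\ell(t), 1 \le \ell \le N\}$ and $\{\psi_\ell(t), 1 \le \ell \le N\}$ denote two orthonormal generators of a SI subspace $\mathcal{A}$ of $L_2$ with coherence $\mu(\Phi,\Psi)$. Let $x(t)$ be a signal in $\mathcal{A}$ and suppose there exist sequences $a_\ell[n], b_\ell[n]$ such that

$$x(t) = \sum_{\ell=1}^{N}\sum_{n\in\mathbb{Z}} \left(a_\ell[n]\phi_\ell(t - nT) + b_\ell[n]\psi_\ell(t - nT)\right) \tag{83}$$

with $k = \|a\|_{2,0} + \|b\|_{2,0}$ satisfying $k < (\sqrt{2} - 0.5)/\mu(\Phi,\Psi)$. Let $M_{\phi\psi}(e^{j\omega})$ be the cross-correlation matrix defined by (64) and suppose that it can be written as $M_{\phi\psi}(e^{j\omega}) = AZ(e^{j\omega})$, where $A$ is unitary and $Z(e^{j\omega})$ is a diagonal unitary matrix. Then, the sequences $a_\ell[n]$ and $b_\ell[n]$ can be found by solving

$$\min_{\Gamma_1,\Gamma_2}\; \|s(\Gamma_1)\|_1 + \|s(\Gamma_2)\|_1 \quad \text{s.t.} \quad C = [\,I \;\; A\,]\begin{bmatrix}\Gamma_1 \\ \Gamma_2\end{bmatrix}. \tag{84}$$

Here $C$ is chosen such that its columns form a basis for the range of $\{c(e^{j\omega}), \omega \in (-\pi, \pi]\}$ where the $\ell$th component of $c(e^{j\omega})$ is the Fourier transform at frequency $\omega$ of $c_\ell[n] = \langle\phi_\ell(t - nT), x(t)\rangle$, and $s(\Gamma_i)$ is a vector whose $\ell$th element is equal to $\|\Gamma_i^\ell\|$ where the norm is arbitrary. Let $S_1, S_2$ denote the rows of $\Gamma_1, \Gamma_2$ that are not identically equal to 0, and define $D_S = [I_{S_1} \;\; A_{S_2}]$. Then the non-zero sequences $a_\ell[n], \ell \in S_1$, and $b_\ell[n], \ell \in S_2$, are given in the Fourier domain by

$$\begin{bmatrix} a^{S_1}(e^{j\omega}) \\ b^{S_2}(e^{j\omega})\end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & Z_{S_2}^{-1}(e^{j\omega})\end{bmatrix}(D_S^H D_S)^{-1}D_S^H\,c(e^{j\omega}). \tag{85}$$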

In Theorem 4 the sparse decomposition is determined from the samples $c_\ell[n] = \langle\phi_\ell(t - nT), x(t)\rangle$. However, the theorem also holds when $c_\ell[n]$ is replaced by any sequence of samples $\langle h_\ell(t - nT), x(t)\rangle$ with $h_\ell(t)$ being an orthonormal basis for $\mathcal{A}$ such that both $M_{h\phi}(e^{j\omega})$ and $M_{h\psi}(e^{j\omega})$ are constant up to a diagonal matrix:

$$M_{h\phi}(e^{j\omega}) = A_1Z_1(e^{j\omega}), \quad M_{h\psi}(e^{j\omega}) = A_2Z_2(e^{j\omega}). \tag{86}$$

In this case the matrix $[I \;\; A]$ in (84) should be replaced by the matrix $[A_1 \;\; A_2]$. Once we find the sparsity set $S$, the sequences that are not zero can be found as in (85) with the identity in the first matrix replaced by the appropriate rows of $Z_1^{-1}(e^{j\omega})$.

Rich case: We next consider the case of an arbitrary $D(e^{j\omega})$, and impose a condition on the sequences $\gamma_\ell[n]$. Specifically, we assume that there exists a finite number $M$ such that the support set of any $M$ vectors $\{\gamma(e^{j\omega_i}), 1 \le i \le M\}$ is equal to $S$. In other words, the joint support of any $M$ vectors $\gamma(e^{j\omega_i})$ is equal to the support of the entire set. Under this assumption, the support recovery problem reduces to an MMV model and can therefore be solved efficiently using MMV techniques. Specifically, we select a set of $M$ frequencies $\omega_i$, and seek the matrix $\Gamma$ with columns $\gamma_i$ that is the solution to

$$\min_{\Gamma} \|s(\Gamma)\|_1 \quad \text{s.t.} \quad c(e^{j\omega_i}) = [\,I \;\; M_{\phi\psi}(e^{j\omega_i})\,]\,\gamma_i, \quad 1 \le i \le M. \tag{87}$$

If the row norm in $s(\Gamma)$ is chosen as the $\ell_1$ norm, then (87) is equivalent to $M$ separate problems, each of the form

$$\min_{\gamma} \|\gamma\|_1 \quad \text{s.t.} \quad c = [\,I \;\; U\,]\,\gamma, \tag{88}$$

where $c = c(e^{j\omega_i})$ and $U = M_{\phi\psi}(e^{j\omega_i})$ is a unitary matrix (see Proposition 5). From Proposition 2, the correct sparsity pattern will be recovered if $\mu(I, U)$ is low enough, which due to Proposition 5 can be guaranteed by upper bounding $\mu(\Phi,\Psi)$.
In some cases, even one frequency $\omega_i$ may be sufficient in order to determine the correct sparsity pattern; this happens when the support of $\gamma(e^{j\omega_i})$ is equal to the support of the entire set of sequences $\gamma(e^{j\omega})$. In practice, we can solve for an increasing number of frequencies, with the hope of recovering the entire support in a finite number of steps. Although we can always construct a set of signals whose joint support cannot be detected in a finite number of steps, this class of signals is small. Therefore, if the sequences are generated at random, then with high probability choosing a finite number of frequencies will be sufficient to recover the entire support set.

VI. Extension to Arbitrary Dictionaries 

Until now we discussed the case of a dictionary comprised of two orthonormal bases. The theory we developed can easily be extended to treat the case of an arbitrary dictionary comprised of sequences $d_\ell(t)$ that form a frame (14) for $\mathcal{A}$. These results follow from combining the approach of the previous section with the corresponding statements in the discrete setting developed in [12], [13], [14].

Specifically, suppose we would like to decompose a vector $x \in \mathbb{C}^N$ in terms of a dictionary $D$ with columns $d_\ell$ using as few vectors as possible. This corresponds to solving

$$\min_{\gamma} \|\gamma\|_0 \quad \text{s.t.} \quad x = D\gamma. \tag{89}$$

Since (89) has combinatorial complexity, we would like to replace it with a computationally efficient algorithm. If $D$ has low coherence, where in this case the coherence of a dictionary with normalized columns is defined by

$$\mu(D) = \max_{\ell \ne r} |d_\ell^H d_r|, \tag{90}$$

then we can determine the sparsest solution $\gamma$ by solving the $\ell_1$ problem

$$\min_{\gamma} \|\gamma\|_1 \quad \text{s.t.} \quad x = D\gamma. \tag{91}$$

The coherence of a dictionary measures the similarity between its elements and is equal to 0 only if the dictionary consists of orthonormal vectors. A general lower bound on the coherence of a matrix $D$ of size $N \times m$ is [14] $\mu(D) \ge [(m - N)/(N(m - 1))]^{1/2}$. The same results hold true for the corresponding MMV model, and are incorporated in the following proposition [13], [12], [14], [25]:

Proposition 7: Let $D$ be an arbitrary dictionary with coherence $\mu(D)$ given by (90). Then the Kruskal rank satisfies $\sigma(D) \ge 1/\mu(D) - 1$. Furthermore, if there exists a choice of coefficients $\Gamma$ such that $X = D\Gamma$ and

$$\|s(\Gamma)\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(D)}\right), \tag{92}$$

then the unique sparse representation can be found by solving (67).
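The quantities entering Proposition 7 can be evaluated directly for a given dictionary. The following sketch assumes the normalized-column form of the coherence and uses the concatenated spike-Fourier dictionary with $N = 4$ purely as an example; the function names `coherence` and `welch_bound` are ours.

```python
import numpy as np

def coherence(D):
    """mu(D): largest absolute inner product between distinct normalized columns."""
    Dn = D / np.linalg.norm(D, axis=0)
    G = np.abs(Dn.conj().T @ Dn)
    np.fill_diagonal(G, 0.0)                 # ignore each column's product with itself
    return float(G.max())

def welch_bound(N, m):
    """Lower bound [(m - N) / (N (m - 1))]^{1/2} on mu(D) for an N x m matrix."""
    return float(np.sqrt((m - N) / (N * (m - 1))))

N = 4
D = np.hstack([np.eye(N), np.fft.fft(np.eye(N)) / np.sqrt(N)])   # N x 2N dictionary
mu = coherence(D)
lower = welch_bound(N, 2 * N)
sparsity_limit = 0.5 * (1.0 + 1.0 / mu)      # right-hand side of (92)

print(mu, lower, sparsity_limit)
```

For this dictionary the coherence exceeds the general lower bound, and (92) only certifies recovery of representations with a single active row, which shows how conservative coherence-based guarantees can be for small $N$.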

We now apply Proposition 7 to the analog design problem. Suppose we have a signal $x(t)$ that lies in a SI space $\mathcal{A}$, and let $\{d_\ell(t - nT), 1 \le \ell \le m\}$ denote an arbitrary frame for $\mathcal{A}$ with $m > N$. As an example, consider the space $\mathcal{A}$ of real signals bandlimited to $(-\pi N/T, \pi N/T]$, which was introduced in Section IV. As we have seen, this space can be generated by the $N$ functions

$$\phi_\ell(t) = \frac{1}{\sqrt{T'}}\,\mathrm{sinc}((t - (\ell - 1)T')/T'), \quad 1 \le \ell \le N, \tag{93}$$

with $T' = T/N$. Suppose now that we define the functions

$$\tilde{\phi}_\ell(t) = \frac{1}{\sqrt{\tilde{T}}}\,\mathrm{sinc}((t - (\ell - 1)\tilde{T})/\tilde{T}), \quad 1 \le \ell \le m, \tag{94}$$

where $\tilde{T} = T/m$ and $m > N$. Using similar reasoning as that used to establish the basis properties of the generators (39), it is easy to see that $\{\tilde{\phi}_\ell(t)\}$ constitute an orthonormal basis for the space of signals bandlimited to $(-\pi m/T, \pi m/T]$, which is larger than $\mathcal{A}$. Filtering each one of the basis signals with a (scaled) LPF with cut-off $\pi/T'$ will result in a redundant set of functions

$$d_\ell(t) = \frac{1}{\sqrt{T'}}\,\mathrm{sinc}((t - (\ell - 1)\tilde{T})/T'), \quad 1 \le \ell \le m, \tag{95}$$

that form a frame for $\mathcal{A}$ [32], [33].

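Our goal is to represent a signal $x(t)$ in $\mathcal{A}$ using as few sequences $d_\ell(t)$ as possible. More specifically, our problem is to choose the vector sequence $\gamma[n]$ such that

$$x(t) = \sum_{\ell=1}^{m}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\,d_\ell(t - nT) \tag{96}$$

and $\|\gamma\|_{2,0}$ is minimized.

To derive an infinite-dimensional alternative to (91), let $\{h_\ell(t)\}$ generate a basis for $\mathcal{A}$. Then $x(t)$ is uniquely determined by the $N$ sampling sequences

$$c_\ell[n] = \langle h_\ell(t - nT), x(t)\rangle = r_\ell(nT), \tag{97}$$

where $r_\ell(t)$ is the convolution $r_\ell(t) = h_\ell(-t) * x(t)$. Therefore, $x(t)$ satisfies (96) only if

$$c_r[m] = \sum_{\ell=1}^{m}\sum_{n\in\mathbb{Z}} \gamma_\ell[n]\,a_{r\ell}[m - n], \tag{98}$$

where $a_{r\ell}[n] = \langle h_r(t - nT), d_\ell(t)\rangle$. In the Fourier domain (98) becomes

$$C_r(e^{j\omega}) = \sum_{\ell=1}^{m} \Gamma_\ell(e^{j\omega})\,R_{h_r d_\ell}(e^{j\omega}). \tag{99}$$

Denoting by $c(e^{j\omega}), \gamma(e^{j\omega})$ the vectors with elements $C_\ell(e^{j\omega}), \Gamma_\ell(e^{j\omega})$ respectively, we can write (99) as

$$c(e^{j\omega}) = M_{hd}(e^{j\omega})\,\gamma(e^{j\omega}). \tag{100}$$

Therefore, our problem is to find the sparsest set of $\gamma(e^{j\omega})$ that satisfies (100).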

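In order to solve the sparse decomposition problem we first treat the case in which $\{h_\ell(t)\}$ are chosen such that

$$M_{hd}(e^{j\omega}) = W(e^{j\omega})AZ(e^{j\omega}), \tag{101}$$

where $A$ is a fixed matrix independent of $\omega$, $Z(e^{j\omega})$ is an invertible diagonal matrix with diagonal elements $Z_\ell(e^{j\omega})$ satisfying $\operatorname{ess\,sup}|Z_\ell(e^{j\omega})| = 1$, and $W(e^{j\omega})$ is an arbitrary invertible matrix. Going back to the bandlimited frame (95), it can be easily seen that with $h_\ell(t) = \phi_\ell(t)$, (101) is satisfied. Indeed,

$$\overline{H_\ell(\omega)}\,D_r(\omega) = \begin{cases} T'\,e^{j\omega(\ell-1)T/N}\,e^{-j\omega(r-1)T/m}, & \omega \in [-\pi N/T, \pi N/T] \\ 0, & \text{otherwise.}\end{cases} \tag{102}$$

Therefore,

$$R_{h_\ell d_r}(e^{j\omega}) = e^{j\omega(\ell-1)/N}\,f(\ell, r)\,e^{-j\omega(r-1)/m}, \tag{103}$$

where $f(\ell, r)$ is a function only of the indices $\ell, r$ and not the frequency $\omega$. Choosing $Z_r(e^{j\omega}) = e^{-j\omega(r-1)/m}$ and $W(e^{j\omega})$ as a diagonal matrix with diagonal elements $W_\ell(e^{j\omega}) = e^{j\omega(\ell-1)/N}$ leads to the representation (101).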

When $M_{hd}(e^{j\omega})$ has the form (101), the system of equations (100) becomes

$$d(e^{j\omega}) = AZ(e^{j\omega})\gamma(e^{j\omega}) = Aa(e^{j\omega}), \tag{104}$$

where we denoted $d(e^{j\omega}) = W^{-1}(e^{j\omega})c(e^{j\omega})$, $a(e^{j\omega}) = Z(e^{j\omega})\gamma(e^{j\omega})$ and used (101). Clearly, $\|a(e^{j\omega})\|_{2,0} = \|\gamma(e^{j\omega})\|_{2,0}$ because $Z(e^{j\omega})$ is invertible and diagonal. Therefore, the sparse decomposition problem is equivalent to finding $a(e^{j\omega})$ satisfying (104) and such that $\|a(e^{j\omega})\|_{2,0}$ is minimized.

As in the previous section, the sparsest $a(e^{j\omega})$ can be determined by first converting (104) to a finite MMV problem, in which we seek the sparsest matrix $U$ that satisfies $C = AU$ where the columns of $C$ form a basis for the span of $\{W^{-1}(e^{j\omega})c(e^{j\omega}), -\pi < \omega \le \pi\}$. The matrix $U$ can be determined by solving the convex problem

$$\min_{U} \|s(U)\|_1 \quad \text{s.t.} \quad C = AU. \tag{105}$$

From Proposition 7 it follows that the unique sparse matrix $U$ can be recovered as long as $\mu(A)$ satisfies (92). Once we determine the non-zero rows $S$ in $U$, we can find the non-zero sequences $\gamma^S[n]$ by noting that from Proposition 7 the columns $A_S$ of $A$ corresponding to $S$ are linearly independent. Therefore,

$$\gamma^S(e^{j\omega}) = Z_S^{-1}(e^{j\omega})(A_S^H A_S)^{-1}A_S^H\,W^{-1}(e^{j\omega})c(e^{j\omega}). \tag{106}$$

If (101) is not satisfied, but instead $\gamma(e^{j\omega})$ is rich, so that the joint support of every set of $M$ vectors (for $M$ different frequencies) is equal to the support of the entire set, then we can still convert the problem into an MMV. To do this, we choose $M$ frequency values $\omega_i$ and seek the set of vectors $\gamma_i$, $1 \le i \le M$, with the sparsest joint support that satisfy

$$c(e^{j\omega_i}) = M_{hd}(e^{j\omega_i})\,\gamma_i, \quad 1 \le i \le M. \tag{107}$$

Once the support is determined, we can find the non-zero sequences $\gamma^S[n]$ using (106).

We have outlined a concrete method to find the sparsest representation of a signal $x(t)$ in $\mathcal{A}$ in terms of an arbitrary dictionary. In our proposed approach, the reconstruction is performed with respect to the samples $c_\ell[n] = \langle h_\ell(t - nT), x(t)\rangle$. We may alternatively view our algorithm as a method to reconstruct $x(t)$ from these samples assuming the knowledge that $x(t)$ has a sparse decomposition in the given dictionary. Thus, our results can also be interpreted as a reconstruction method from a given set of samples, and in that sense complement the results of [22].

VII. Conclusion 

In this paper, we extended the recent line of work on 
generalized uncertainty principles to the analog domain, by 
considering sparse representations in SI bases. We showed 
that there is a fundamental limit on the ability to sparsely 
represent an analog signal in an infinite-dimensional SI space 
in two orthonormal bases. The sparsity bound is similar 
to that obtained in the finite-dimensional discrete setting: 
In both cases the joint sparsity is limited by the inverse 
coherence of the bases. However, while in the finite setting, 
the coherence is defined as the maximal absolute inner 
product between elements from each basis, in the analog 
problem the coherence is the maximal absolute value of the 
sampled cross-spectrum between the signals. 

As in the finite domain, we can show that the proposed 
uncertainty relation is tight by providing a concrete example 
in which it is achieved. Our example mimics the finite 
setting by considering the class of bandlimited signals as the 
signal space. This leads to a Fourier representation that is 
defined over a finite, albeit continuous, interval. Within this 
space we can achieve the uncertainty limit by considering a 
bandlimited train of LPFs. This choice of signal resembles 
the spike train which is known to achieve the uncertainty 
principle in the discrete setting. 

Finally, we treated the problem of sparsely representing an 
analog signal in an overcomplete dictionary. Building upon 
the uncertainty principle and recent works in the area of 
compressed sensing for analog signals, we showed that under 
certain conditions on the Fourier domain representation of 
the dictionary, the sparsest representation can be found by 
solving a finite-dimensional convex optimization problem. 
The fact that sparse decompositions can be found by solving 
a convex optimization problem has been established in many 
previous works in compressed sensing in the finite setting. 
The additional twist here is that even though the problem 
has infinite dimensions, it can be solved exactly by a finite- 
dimensional program in many interesting cases. 

In this paper we have focused on analog signals in SI 
spaces. A very interesting further line of research is to extend 
these ideas and notions to a larger class of analog signals, 
leading to a broader notion of analog sparsity and analog 
compressed sensing.

Appendix I
Proof of Proposition 1

To prove the proposition, note that 

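\[
\int_{-\infty}^{\infty} |x(t)|^2\,dt
= \frac{1}{2\pi}\int_{-\infty}^{\infty} |X(\omega)|^2\,d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} \Big|\sum_{\ell=1}^{N} A_\ell(e^{j\omega T})\,\Phi_\ell(\omega)\Big|^2 d\omega, \tag{108}
\]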



where the last equality follows from (7). To simplify (108),
we rewrite the integral over the entire real line as the sum
of integrals over intervals of length $2\pi/T$:



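\[
\int_{-\infty}^{\infty} X(\omega)\,d\omega
= \sum_{k=-\infty}^{\infty} \int_{0}^{2\pi/T} X\Big(\omega - \frac{2\pi}{T}k\Big)\,d\omega, \tag{109}
\]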



for all $X(\omega)$. Substituting into (108) and using the fact that
$A_\ell(e^{j\omega T})$ is $2\pi/T$-periodic, we obtain



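\[
\begin{aligned}
\int_{-\infty}^{\infty} |x(t)|^2\,dt
&= \frac{1}{2\pi}\int_{0}^{2\pi/T} \sum_{k=-\infty}^{\infty} \Big|\sum_{\ell=1}^{N} A_\ell(e^{j\omega T})\,\Phi_\ell\Big(\omega - \frac{2\pi}{T}k\Big)\Big|^2 d\omega \\
&= \frac{1}{2\pi}\int_{0}^{2\pi/T} \sum_{\ell=1}^{N}\sum_{r=1}^{N} A_\ell^*(e^{j\omega T})\,A_r(e^{j\omega T}) \sum_{k=-\infty}^{\infty} \Phi_\ell^*\Big(\omega - \frac{2\pi}{T}k\Big)\Phi_r\Big(\omega - \frac{2\pi}{T}k\Big)\,d\omega \\
&= \frac{T}{2\pi}\int_{0}^{2\pi/T} \sum_{\ell=1}^{N} |A_\ell(e^{j\omega T})|^2\,d\omega,
\end{aligned}
\tag{110}
\]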



where we used (11).
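
The identity being proved can be checked numerically for finite-length coefficient sequences. The sketch below assumes orthonormal generators, so that the signal energy equals the total coefficient energy, and compares that energy with the frequency-domain average of $\sum_\ell |A_\ell(e^{j\omega})|^2$ on a uniform grid. The sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 3, 64                        # N generators, L coefficients each (assumed sizes)
a = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))

# Samples of A_l(e^{jw}) on the uniform grid w_k = 2*pi*k/L.
A_freq = np.fft.fft(a, axis=1)

# Coefficient energy: sum over l and n of |a_l[n]|^2.
energy_time = np.sum(np.abs(a) ** 2)

# Riemann sum for (1/2pi) * integral over [0, 2pi) of sum_l |A_l(e^{jw})|^2.
# The sum is exact here because each sequence has finite length L.
energy_freq = np.sum(np.abs(A_freq) ** 2) / L

print(energy_time, energy_freq)
```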



Appendix II 
Proof of Proposition 2

To prove the proposition, we first note that since $\phi_i(t)$ is
in $\mathcal{A}$ for each $i$, we can express it as

\[
\phi_i(t) = \sum_{r=1}^{N} \sum_{n \in \mathbb{Z}} a_r^i[n]\,\psi_r(t - nT) \tag{111}
\]

for some coefficients $a_r^i[n]$ with Fourier transform $A_r^i(e^{j\omega})$.
We have shown in the proof of Theorem 2 that the orthonormality
condition (11) of $\psi_i(t)$ implies that

\[
A_r^i(e^{j\omega}) = R_{\psi_r \phi_i}(e^{j\omega}). \tag{112}
\]

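Now, since $\{\phi_\ell(t - nT)\}$ is an orthonormal basis for $\mathcal{A}$,
$R_{\phi_i \phi_r}(e^{j\omega}) = \delta_{ir}$. From (111),

\[
\begin{aligned}
R_{\phi_i \phi_r}(e^{j\omega})
&= \sum_{m=1}^{N} \sum_{s=1}^{N} \big(A_m^i(e^{j\omega})\big)^* A_s^r(e^{j\omega})\, R_{\psi_m \psi_s}(e^{j\omega}) \\
&= \sum_{m=1}^{N} \big(A_m^i(e^{j\omega})\big)^* A_m^r(e^{j\omega}) \\
&= [M_{\Phi\Psi}(e^{j\omega})]^i \big([M_{\Phi\Psi}(e^{j\omega})]^r\big)^H,
\end{aligned}
\tag{113}
\]

where $[C]^r$ denotes the $r$th row of $C$. The second equality in
(113) follows from the orthonormality of $\{\psi_\ell(t - nT)\}$, and
the last equality is a result of (112). Since $R_{\phi_i \phi_r}(e^{j\omega}) = \delta_{ir}$,
it follows from (113) that the matrix $M_{\Phi\Psi}(e^{j\omega})$ is unitary
for all $\omega$.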

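Since $M_{\Phi\Psi}(e^{j\omega})$ is unitary, the coherence
$\mu(I, M_{\Phi\Psi}(e^{j\omega}))$ is well defined. Now for any
unitary $U$, $\mu(I, U) = \max_{i,j} |U_{ij}|$. In addition,
$\mu(\Phi, \Psi) = \max_{\ell, r} \sup_{\omega} |[M_{\Phi\Psi}(e^{j\omega})]_{\ell r}|$, so that
$\mu(I, M_{\Phi\Psi}(e^{j\omega})) \le \mu(\Phi, \Psi)$, completing the proof.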
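
The fact that $\mu(I, U) = \max_{i,j} |U_{ij}|$ for a unitary $U$ is easy to check numerically, since the inner product between the $i$th column of $I$ and the $j$th column of $U$ is simply $U_{ij}$. The sketch below uses the unitary DFT matrix as a hypothetical example of $U$; the helper name `coherence_identity_vs` is an assumption made for illustration.

```python
import numpy as np

def coherence_identity_vs(U):
    """mu(I, U): largest absolute inner product between a column of I and a column of U."""
    return np.max(np.abs(U))

N = 16
F = np.fft.fft(np.eye(N)) / np.sqrt(N)     # unitary DFT matrix
is_unitary = np.allclose(F.conj().T @ F, np.eye(N))

mu_dft = coherence_identity_vs(F)          # every entry has modulus 1/sqrt(N)
mu_id = coherence_identity_vs(np.eye(N))   # identical bases: coherence 1

print(is_unitary, mu_dft, mu_id)
```

The DFT case attains the smallest value a unitary matrix allows, $1/\sqrt{N}$, while identical bases give the largest value, $1$.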